DESCRIPTION

Protein C Production in Transgenic Animals

BACKGROUND OF THE INVENTION 

Protein C in its activated form plays an
important role in regulating blood coagulation. Activated
protein C, a serine protease, inactivates coagulation
Factors Va and VIIIa by limited proteolysis. The
coagulation cascade initiated by tissue injury, for
example, is prevented from proceeding in an unimpeded
chain reaction beyond the area of injury by activated
protein C.

Protein C is synthesized in the liver as a
single chain precursor polypeptide which is subsequently
processed to a light chain of about 155 amino acids (Mr =
21,000) and a heavy chain of 262 amino acids (Mr = 40,000).
The heavy and light chains circulate in the blood as a
two-chain inactive protein, or zymogen, held together by a
disulfide bond. When a 12 amino acid residue peptide is
cleaved from the amino terminus of the heavy chain portion
of the zymogen in a reaction mediated by thrombin, the
protein becomes activated. The N-terminal portion of the
light chain contains nine γ-carboxyglutamic acid (Gla)
residues that are required for the calcium-dependent
membrane binding and activation of the molecule. Another
blood protein, referred to as "protein S", is believed to
accelerate the protein C-catalyzed proteolysis of Factor
Va.

Protein C has also been implicated in the action
of tissue-type plasminogen activator (Kisiel et al.,
Behring Inst. Mitt. 73:29-42, 1983). Infusion of bovine
activated protein C (APC) into dogs results in increased
plasminogen activator activity (Comp et al., J. Clin.
Invest. 68:1221-1228, 1981). Other studies (Sakata et
al., Proc. Natl. Acad. Sci. USA 82:1121-1125, 1985) have
shown that addition of APC to cultured endothelial cells
leads to a rapid, dose-dependent increase in fibrinolytic
activity in the conditioned media, reflecting increases in
the activity of both urokinase-related and tissue-type
plasminogen activators. APC treatment also results in a
dose-dependent decrease in anti-activator activity. In
addition, studies with monoclonal antibodies against
endogenous APC (Snow et al., FASEB Abstracts, 1988)
implicate APC in maintaining patency of arteries during
fibrinolysis and limiting the extent of tissue infarct.

Experimental evidence indicates that protein C
may be clinically useful in the treatment of thrombosis.
Several studies with baboon models of thrombosis have
indicated that activated protein C in low doses will be
effective in prevention of fibrin deposition, platelet
deposition and loss of circulation (Gruber et al.,
Hemostasis and Thrombosis 121a: abstract 1512, 1988;
Widrow et al., Fibrinolysis 2 suppl. 1: abstract 7, 1988;
Griffin et al., Thromb. Haemostasis 62: abstract 1512,
1989).

In addition, exogenous activated protein C has
been shown to prevent the coagulopathic and lethal effects
of gram negative septicemia (Taylor et al., J. Clin.
Invest. 79:918-925, 1987). Data obtained from studies
with baboons suggest that activated protein C plays a
natural role in protecting against septicemia.

Until recently, protein C was purified from
clotting factor concentrates (Marlar et al., Blood
59:1067-1072, 1982) or from plasma (Kisiel, J. Clin.
Invest. 64:761-769, 1979) and activated in vitro.
However, the possibility that the resulting product could
be contaminated with such infectious agents as hepatitis
virus, cytomegalovirus, or human immunodeficiency virus
(HIV) makes the process unfavorable.

While expression of protein C through
recombinant means has been theoretically possible, as the
genes for both human and bovine protein C are known
(Foster et al., Proc. Natl. Acad. Sci. USA 82:4673-4677,
1985; Foster et al., Proc. Natl. Acad. Sci. USA 81:4766-4770,
1984 and U.S. Patent 4,775,624), it has been met
with limited success. Expression of some vitamin
K-dependent proteins, such as protein C, in cultured cells
has not produced protein C that is both present at
commercially valuable levels and biologically functional
when activated, i.e., having anticoagulant activity
(Grinnell et al., in Bruley and Drohan, eds., Protein C and
Related Anticoagulants 29-63, Gulf Publishing, Houston, TX,
1990 and Grinnell et al., Bio/Technol. 5:1189-1192, 1987).
Transgenic expression of protein C has yielded somewhat
higher levels of expression, but the recombinant protein's
anticoagulant activity has still remained low, with less
than 50% of the material having biological activity
(Velander et al., Proc. Natl. Acad. Sci. USA 89:12003-12007,
1992). Therefore, there remains a need for
producing protein C that is both expressed at high levels
and has therapeutic value.

SUMMARY OF THE INVENTION

It is an object of the present invention to
provide methods for producing protein C in transgenic
animals. It is a further object to provide transgenic
animals that express human protein C in a mammary gland.

Within one aspect, the present invention
provides methods for producing protein C in a transgenic
animal comprising (a) providing a DNA construct comprising
a first DNA segment encoding a secretion signal and a
protein C propeptide operably linked to a second DNA
segment encoding protein C, wherein the encoded protein C
comprises a two-chain cleavage site modified from Lys-Arg
to R1-R2-R3-R4, and wherein each of R1-R4 is individually
Lys or Arg, and wherein said first and second segments are
operably linked to additional DNA segments required for
expression of the protein C DNA in a lactating mammary
gland of a host female animal; (b) introducing said DNA
construct into a fertilized egg of a non-human mammalian
species; (c) inserting said egg into an oviduct or uterus
of a female of said species to obtain offspring carrying
said DNA construct; (d) breeding said offspring to produce
female progeny that express said first and second DNA
segments and produce milk containing protein C encoded by
said second segment, wherein said protein has
anticoagulant activity upon activation; (e) collecting
milk from said female progeny; and (f) recovering the
protein C from the milk. In one embodiment, R1-R2-R3-R4
is Arg-Arg-Lys-Arg (SEQ ID NO: 20). In another
embodiment, the method further comprises the step of
activating the protein C. In another embodiment, the
non-human mammalian species is selected from sheep, rabbits,
cattle and goats. In another embodiment, each of the first
and second DNA segments comprises an intron. In another
embodiment, the second DNA segment comprises a DNA
sequence of nucleotides as shown in SEQ ID NO: 1 or SEQ ID
NO: 3. In another embodiment, the additional DNA segments
comprise a transcriptional promoter selected from the
group consisting of casein, β-lactoglobulin,
α-lactoglobulin, α-lactalbumin and whey acidic protein gene
promoters.

In another aspect, the present invention
provides a transgenic non-human female mammal that
produces recoverable amounts of human protein C in its
milk, wherein at least 90% of the human protein C in the
milk is two-chain protein C.

In another aspect, the present invention
provides a process for producing a transgenic offspring of
a mammal comprising the steps of (a) providing a DNA
construct comprising a first DNA segment encoding a
secretion signal and a protein C propeptide operably
linked to a second DNA segment encoding protein C, wherein
the encoded protein C comprises a two-chain cleavage site
modified from Lys-Arg to R1-R2-R3-R4, and wherein each of
R1-R4 is individually Lys or Arg, and wherein said first
and second segments are operably linked to additional DNA
segments required for expression of the protein C DNA in a
lactating mammary gland of a host female animal; (b)
introducing said DNA construct into a fertilized egg of a
non-human mammalian species; and (c) inserting said egg
into an oviduct or uterus of a female of said species to
obtain offspring carrying said DNA construct.

Within another aspect, the present invention
provides non-human mammals produced according to the
process for producing a transgenic offspring of a mammal
comprising the steps of (a) providing a DNA construct
comprising a first DNA segment encoding a secretion signal
and a protein C propeptide operably linked to a second DNA
segment encoding protein C, wherein the encoded protein C
comprises a two-chain cleavage site modified from Lys-Arg
to R1-R2-R3-R4, and wherein each of R1-R4 is individually
Lys or Arg, and wherein said first and second segments are
operably linked to additional DNA segments required for
expression of the protein C DNA in a lactating mammary
gland of a host female animal; (b) introducing said DNA
construct into a fertilized egg of a non-human mammalian
species; and (c) inserting said egg into an oviduct or
uterus of a female of said species to obtain offspring
carrying said DNA construct.

In another aspect, the present invention
provides a non-human mammalian embryo containing in its
nucleus a heterologous DNA segment encoding protein C,
wherein the encoded protein C comprises a two-chain
cleavage site modified from Lys-Arg to R1-R2-R3-R4, and
wherein each of R1-R4 is individually Lys or Arg.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 illustrates analysis of plasma-derived
and transgenic protein C run under non-reducing and
reducing conditions. Lane 1 is plasma-derived protein C
and lane 2 is transgenic protein C from the milk of sheep
30851.

Figure 2 illustrates sequencing of protein C
from sheep line 30851. The initial yields were
prosequence = 9 pmol, light chain = 563 pmol and heavy
chain = 565 pmol.

Figure 3 illustrates clotting activity of
transgenic protein C compared to plasma-derived protein C.

DETAILED DESCRIPTION OF THE INVENTION

Prior to setting forth the invention in detail,
it will be helpful to define certain terms used herein:

As used herein, the term "biologically active"
is used to denote protein C that is characterized by its
anticoagulant and fibrinolytic properties. Protein C,
when activated, inactivates factor Va and factor VIIIa in
the presence of phospholipid and calcium. Activated
protein C also enhances fibrinolysis, an effect believed
to be mediated by the lowering of the levels of
plasminogen activator inhibitors. As stated previously,
two-chain protein C is activated upon cleavage of a 12
amino acid peptide from the amino terminus of the heavy
chain portion of the zymogen.

The term "egg" is used to denote an unfertilized
ovum, a fertilized ovum prior to fusion of the pronuclei
or an early stage embryo (fertilized ovum with fused
pronuclei).

A "female mammal that produces milk containing
biologically active protein C" is one that, following
pregnancy and delivery, produces, during the lactation
period, milk containing recoverable amounts of protein C
that can be activated to be biologically active. Those
skilled in the art will recognize that such animals will
naturally produce milk, and therefore the protein C,
discontinuously.

The term "progeny" is used in its usual sense to
include offspring and descendants.

The term "heterologous" is used to denote
genetic material originating from a different species than
that into which it has been introduced, or a protein
produced from such genetic material.

Within the present invention, transgenic animal
technology is employed to produce protein C within a
mammary gland of a host female mammal. Expression in the
mammary gland and subsequent secretion of the protein of
interest into the milk overcomes many difficulties
encountered in isolating proteins from other sources.
Milk is readily collected, available in large quantities,
and well characterized biochemically. Furthermore, the
major milk proteins are present in milk at high
concentrations (from about 1 to 16 g/l).

From a commercial point of view, it is clearly
preferable to use as the host a species that has a large
milk yield. While smaller animals such as mice and rats
can be used (and are preferred at the proof-of-concept
stage), within the present invention it is preferred to
use livestock mammals including sheep and cattle. Sheep
are particularly preferred due to such factors as the
previous history of transgenesis in this species, milk
yield, generation time, cost and the ready availability of
equipment for collecting sheep milk. It is generally
desirable to select a breed of host animal that has been
bred for dairy use, such as East Friesland sheep, or to
introduce dairy stock by breeding of the transgenic line
at a later date. In any event, animals of known, good
health status should be used.

Cloned DNA sequences encoding human protein C
have been described (Foster and Davie, Proc. Natl. Acad.
Sci. USA 81:4766-4770, 1984; Foster et al., Proc. Natl.
Acad. Sci. USA 82:4673-4677, 1985; and Bang et al., U.S.
Patent 4,775,624, each incorporated herein by reference).
Complementary DNAs (cDNAs) encoding protein C can be
obtained from libraries prepared from liver cells of various
mammalian species according to standard laboratory
procedures. DNAs from other species, such as those
encoding the protein C of rats, pigs, sheep, cows and
primates, can be used and can be identified using probes
from human cDNA.

In a preferred embodiment, human genomic DNAs
encoding protein C are used. The human protein C gene is
composed of nine exons ranging in size from 25 to 885
nucleotides, and seven introns ranging in size from 92 to
2668 nucleotides (U.S. Patent 4,959,318, incorporated
herein by reference). The first exon is non-coding and
referred to as exon 0. Exon I and a portion of exon II
code for the 42 amino acid signal sequence and propeptide
(i.e., pre-propeptide). The remaining portion of exon II,
exon III, exon IV, exon V and a portion of exon VI code
for the light chain of protein C. The remaining portion
of exon VI, exon VII and exon VIII code for the heavy
chain of protein C. A representative human genomic DNA
sequence and corresponding amino acid sequence are shown
in SEQ ID NOS: 1 and 2, respectively. A representative
human protein C cDNA sequence and corresponding amino acid
sequence are shown in SEQ ID NOS: 3 and 4, respectively.

Those skilled in the art will recognize that
naturally occurring allelic variants of these sequences
will exist; that additional variants can be generated by
amino acid substitution, deletion, or insertion; and that
such variants are useful within the present invention. In
general, it is preferred that any engineered variants
comprise only a limited number of amino acid
substitutions, deletions, or insertions, and that any
substitutions are conservative. Thus, it is preferred to
produce protein C polypeptides that are at least 90%, and
more preferably at least 95% or more identical in sequence
to the corresponding native protein.
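
The 90% and 95% identity thresholds can be illustrated with a
minimal sketch. The text does not specify how identity is to be
computed, so the function below rests on stated assumptions: the
two sequences are already aligned to equal length, gaps are
written as "-", and identity is the number of matching residue
columns divided by the number of aligned columns. The function
name and the toy sequences are hypothetical and are not taken
from SEQ ID NOS: 1-4.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (not from the source text): the denominator is the
    number of alignment columns in which at least one sequence has
    a residue. Other conventions exist and give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only if both residues are equal
    # and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical 10-residue example with one substitution.
native = "ACDEFGHIKL"
variant = "ACDEFGHIKV"
identity = percent_identity(native, variant)
meets_90 = identity >= 90.0
meets_95 = identity >= 95.0
```

In this toy case the variant would satisfy the 90% criterion but
not the 95% criterion; real comparisons would use a proper
alignment of the full-length polypeptides.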

Within the present invention, the proteolytic
processing involved in the maturation of recombinant
protein C from single chain form to the two-chain form
(i.e., cleaved between the light chain and the heavy
chain) has been enhanced by modifying the amino acid
sequence around the two-chain cleavage site. In the
normal situation, endoproteolytic cleavage of the
precursor molecule at the Arg157-Asp158 bond and the
removal of the dipeptide Lys156-Arg157 by
carboxypeptidase activity generate the light and heavy
chains of protein C prior to secretion. Expression of
protein C with the native (Lys-Arg) two-chain cleavage
site produces protein C that may contain up to 40% or more
uncleaved, single-chain protein C (Grinnell et al., in
Protein C and Related Anticoagulants, eds., Bruley and
Drohan, Gulf, Houston, pp. 29-63, 1990; Suttie, Thromb.
Res. 44:129-134, 1986 and Yan et al., Trends Biochem. Sci.
14:264-268, 1989). The single-chain form of protein C may
not be able to be activated. The cleavage site may be in
the form of the amino acid sequence R1-R2-R3-R4, wherein
each of R1 through R4 is individually lysine (Lys) or
arginine (Arg). Particularly preferred sequences include
Arg-Arg-Lys-Arg (SEQ ID NO: 20) and Lys-Arg-Lys-Arg (SEQ
ID NO: 21).
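
Because each of R1 through R4 is independently Lys or Arg, the
formula R1-R2-R3-R4 covers 2^4 = 16 possible tetrapeptides. The
short sketch below simply enumerates them; it is illustrative
only, and the function name is hypothetical.

```python
from itertools import product


def cleavage_site_variants() -> list[str]:
    """List every R1-R2-R3-R4 tetrapeptide with each Rn = Lys or Arg."""
    return ["-".join(combo) for combo in product(("Lys", "Arg"), repeat=4)]


variants = cleavage_site_variants()
# The two sequences named as particularly preferred in the text.
preferred = ("Arg-Arg-Lys-Arg", "Lys-Arg-Lys-Arg")
preferred_present = all(site in variants for site in preferred)
```

Both preferred sequences end in Lys-Arg, so they can be read as
the native dipeptide extended by two additional basic residues.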

In a preferred embodiment, the present invention
provides for recoverable amounts of human protein C in the
milk of a non-human mammal, where at least 90%, preferably
at least 95%, of the human protein C is two-chain protein
C.

To obtain expression in the mammary gland, a
transcription promoter from a milk protein gene is used.
Milk protein genes include those genes encoding caseins,
beta-lactoglobulin (BLG), α-lactalbumin, and whey acidic
protein. The beta-lactoglobulin promoter is preferred.
In the case of the ovine beta-lactoglobulin gene, a region
of at least the proximal 406 bp of 5' flanking sequence of
the ovine BLG gene (contained within nucleotides 3844 to
4257 of SEQ ID NO: 5) will generally be used. Larger
portions of the 5' flanking sequence, up to about 5 kb,
are preferred. A larger DNA segment encompassing the 5'
flanking promoter region and the region encoding the 5'
non-coding portion of the beta-lactoglobulin gene
(contained within nucleotides 1 to 4257 of SEQ ID NO: 5)
is particularly preferred. See Whitelaw et al., Biochem
J. 286: 31-39, 1992. Similar fragments of promoter DNA
from other species are also suitable.

Other regions of the beta-lactoglobulin gene may
also be incorporated in constructs, as may genomic regions
of the gene to be expressed. It is generally accepted in
the art that constructs lacking introns, for example,
express poorly in the transgenic lactating mammary gland
in comparison with those constructs that contain introns
(see Brinster et al., Proc. Natl. Acad. Sci. USA 85:
836-840, 1988; Palmiter et al., Proc. Natl. Acad. Sci. USA
88: 478-482, 1991; Whitelaw et al., Transgenic Res. 1: 3-13,
1991; WO 89/01343; WO 91/02318). In this regard, it is
generally preferred, where possible, to use genomic
sequences containing all or some of the native introns of
a gene encoding protein C. Within certain embodiments of
the invention, the further inclusion of at least some
introns from the beta-lactoglobulin gene is preferred.
One such region is a DNA segment which provides for intron
splicing and RNA polyadenylation from the 3' non-coding
region of the ovine beta-lactoglobulin gene. When
substituted for the natural 3' non-coding sequences of a
gene, this ovine beta-lactoglobulin segment can both
enhance and stabilize expression levels of the protein C.

For expression of protein C, DNA segments
encoding protein C are operably linked to additional DNA
segments required for their expression to produce
expression units. One such additional segment is the
above-mentioned milk protein gene promoter. Sequences
allowing for termination of transcription and
polyadenylation of mRNA may also be incorporated. Such
sequences are well known in the art; for example, one such
termination sequence is the "upstream mouse sequence"
(McGeady et al., DNA 5:289-298, 1986). The expression
units will further include a DNA segment encoding a
secretion signal operably linked to the segment encoding
the protein C polypeptide chain. The secretion signal may
be a native protein C secretion signal or may be that of
another protein, such as a milk protein. The term
"secretion signal" is used herein to denote that portion
of a protein that directs it through the secretory pathway
of a cell to the outside. Secretion signals are most
commonly found at the amino termini of proteins. See, for
example, von Heijne, Nuc. Acids Res. 14: 4683-4690, 1986;
and Meade et al., U.S. Patent No. 4,873,316, which are
incorporated herein by reference.

Construction of expression units is conveniently
carried out by inserting a protein C sequence into a
plasmid or phage vector containing the additional DNA
segments, although the expression unit may be constructed
by essentially any sequence of ligations. It is
particularly convenient to provide a vector containing a
DNA segment encoding a milk protein and to replace the
coding sequence for the milk protein with that of a
protein C (including a secretion signal), thereby creating
a gene fusion that includes the expression control
sequences of the milk protein gene. In any event, cloning
of the expression units in plasmids or other vectors
facilitates the amplification of the protein C sequences.
Amplification is conveniently carried out in bacterial
(e.g. E. coli) host cells, thus the vectors will typically
include an origin of replication and a selectable marker
functional in bacterial host cells.

The expression unit is then introduced into
fertilized eggs (including early-stage embryos) of the
chosen host species. Introduction of heterologous DNA can
be accomplished by one of several routes, including
pronuclear microinjection (e.g. U.S. Patent No.
4,873,191), retroviral infection (Jaenisch, Science 240:
1468-1474, 1988) or site-directed integration using
embryonic stem (ES) cells (reviewed by Bradley et al.,
Bio/Technology 10: 534-539, 1992). The eggs are then
implanted into the oviducts or uteri of pseudopregnant
females and allowed to develop to term. Offspring
carrying the introduced DNA in their germ line can pass
the DNA on to their progeny in the normal, Mendelian
fashion, allowing the development of transgenic herds.
General procedures for producing transgenic animals are
known in the art. See, for example, Hogan et al.,
Manipulating the Mouse Embryo: A Laboratory Manual, Cold
Spring Harbor Laboratory, 1986; Simons et al.,
Bio/Technology 6: 179-183, 1988; Wall et al., Biol.
Reprod. 32: 645-651, 1985; Buhler et al., Bio/Technology
8: 140-143, 1990; Ebert et al., Bio/Technology 9: 835-838,
1991; Krimpenfort et al., Bio/Technology 9: 844-847, 1991;
Wall et al., J. Cell. Biochem. 49: 113-120, 1992; and WIPO
publications WO 88/00239, WO 90/05188, WO 92/11757; and GB
87/00458, which are incorporated herein by reference.
Techniques for introducing foreign DNA sequences into
mammals and their germ cells were originally developed in
the mouse. See, e.g., Gordon et al., Proc. Natl. Acad.
Sci. USA 77: 7380-7384, 1980; Gordon and Ruddle, Science
214: 1244-1246, 1981; Palmiter and Brinster, Cell 41:
343-345, 1985; Brinster et al., Proc. Natl. Acad. Sci. USA
82: 4438-4442, 1985; and Hogan et al. (ibid.). These
techniques were subsequently adapted for use with larger
animals, including livestock species (see e.g., WIPO
publications WO 88/00239, WO 90/05188, and WO 92/11757;
and Simons et al., Bio/Technology 6: 179-183, 1988). To
summarize, in the most efficient route used to date in the
generation of transgenic mice or livestock, several
hundred linear molecules of the DNA of interest are
injected into one of the pro-nuclei of a fertilized egg.
Injection of DNA into the cytoplasm of a zygote can also
be employed.

In general, female animals are superovulated by
treatment with follicle stimulating hormone, then mated.
Fertilized eggs are collected, and the heterologous DNA is
injected into the eggs using known methods. See, for
example, U.S. Patent No. 4,873,191; Gordon et al., Proc.
Natl. Acad. Sci. USA 77: 7380-7384, 1980; Gordon and
Ruddle, Science 214: 1244-1246, 1981; Palmiter and
Brinster, Cell 41: 343-345, 1985; Brinster et al., Proc.
Natl. Acad. Sci. USA 82: 4438-4442, 1985; Hogan et al.,
Manipulating the Mouse Embryo: A Laboratory Manual, Cold
Spring Harbor Laboratory, 1986; Simons et al.,
Bio/Technology 6: 179-183, 1988; Wall et al., Biol.
Reprod. 32: 645-651, 1985; Buhler et al., Bio/Technology
8: 140-143, 1990; Ebert et al., Bio/Technology 9: 835-838,
1991; Krimpenfort et al., Bio/Technology 9: 844-847, 1991;
Wall et al., J. Cell. Biochem. 49: 113-120, 1992; WIPO
publications WO 88/00239, WO 90/05188, and WO 92/11757;
and GB 87/00458, which are incorporated herein by
reference.

For injection into fertilized eggs, the
expression units are removed from their respective vectors
by digestion with appropriate restriction enzymes. For
convenience, it is preferred to design the vectors so that
the expression units are removed by cleavage with enzymes
that do not cut either within the expression units or
elsewhere in the vectors. The expression units are
recovered by conventional methods, such as electro-elution
followed by phenol extraction and ethanol precipitation,
sucrose density gradient centrifugation, or combinations
of these approaches.

DNA is injected into eggs essentially as
described in Hogan et al., ibid. In a typical injection,
eggs in a dish of an embryo culture medium are located
using a stereo zoom microscope (x50 or x63 magnification
preferred). Suitable media include Hepes
(N-2-hydroxyethylpiperazine-N'-2-ethanesulphonic acid) or
bicarbonate buffered media such as M2 or M16 (available
from Sigma Chemical Co., St. Louis, USA) or synthetic
oviduct medium (disclosed below). The eggs are secured
and transferred to the center of a glass slide on an
injection rig using, for example, a Drummond pipette
complete with capillary tube. Viewing at lower (e.g. x4)
magnification is used at this stage. Using the holding
pipette of the injection rig, the eggs are positioned
centrally on the slide. Individual eggs are sequentially
secured to the holding pipette for injection. For each
injection process, the holding pipette/egg is positioned
in the center of the viewing field. The injection needle
is then positioned directly below the egg. Preferably
using x40 Nomarski objectives, both manipulator heights
are adjusted to focus both the egg and the needle. The
pronuclei are located by rotating the egg and adjusting
the holding pipette assembly as necessary. Once the
pronucleus has been located, the height of the manipulator
is altered to focus the pronuclear membrane. The
injection needle is positioned below the egg such that the
needle tip is in a position below the center of the
pronucleus. The position of the needle is then altered
using the injection manipulator assembly to bring the
needle and the pronucleus into the same focal plane. The
needle is moved, via the joy stick on the injection
manipulator assembly, to a position to the right of the
egg. With a short, continuous jabbing movement, the
pronuclear membrane is pierced to leave the needle tip
inside the pronucleus. Pressure is applied to the
injection needle via, for example, a glass syringe until
the pronucleus swells to approximately twice its volume.
At this point, the needle is slowly removed. Reverting to
lower (e.g. x4) magnification, the injected egg is moved
to a different area of the slide, and the process is
repeated with another egg.

After the DNA is injected, the eggs may be
cultured to allow the pronuclei to fuse, producing
one-cell or later stage embryos. In general, the eggs are
cultured at approximately the body temperature of the
species used in a buffered medium containing balanced
salts and serum. Surviving embryos are then transferred
to pseudopregnant recipient females, typically by
inserting them into the oviduct or uterus, and allowed to
develop to term. During embryogenesis, some of the
injected DNA integrates in a random fashion in the genomes
of a small number of the developing embryos.

Potential transgenic offspring are screened via
blood samples and/or tissue biopsies. DNA is prepared
from these samples and examined for the presence of the
injected construct by techniques such as polymerase chain
reaction (PCR; see Mullis, U.S. Patent No. 4,683,202) and
Southern blotting (Southern, J. Mol. Biol. 98:503, 1975;
Maniatis et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory, 1982). Founder transgenic
animals, or G0s, may be wholly transgenic, having
transgenes in all of their cells, or mosaic, having
transgenes in only a subset of cells (see, for example,
Wilkie et al., Develop. Biol. 118: 9-18, 1986). In the
latter case, groups of germ cells may be wholly or
partially transgenic. Where the germ cells are only
partially transgenic, the number of transgenic progeny
from a founder animal will be less than the 50% predicted
from Mendelian principles. Founder G0 animals are grown
to sexual maturity and mated to obtain offspring, or G1s.
The G1s are also examined for the presence of the
transgene to demonstrate transmission from founder G0
animals. In the case of male G0s, these may be mated with
several non-transgenic females to generate many offspring.
This increases the chances of observing transgene
transmission. Female G0 founders may be mated naturally,
artificially inseminated or superovulated to obtain many
eggs which are transferred to surrogate mothers. The
latter course gives the best chance of observing
transmission in animals having a limited number of young.
The above-described breeding procedures are used to obtain
animals that can pass the DNA on to subsequent generations
of offspring in the normal, Mendelian fashion, allowing
the development of, for example, colonies (mice), flocks
(sheep), or herds (pigs, goats and cattle) of transgenic
animals.
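
The breeding logic above can be made concrete with a small
sketch. It assumes a single transgene integration site carried
in the hemizygous state, independent inheritance in each
offspring, and, for mosaic founders, a hypothetical parameter
germline_fraction giving the fraction of germ cells that carry
the transgene. None of these parameters or function names come
from the text; they are illustrative only.

```python
def transmission_rate(germline_fraction: float = 1.0) -> float:
    """Expected fraction of transgenic offspring from a founder.

    A wholly transgenic hemizygous founder (germline_fraction = 1.0)
    gives the Mendelian 50%. A mosaic founder gives proportionally less.
    """
    if not 0.0 <= germline_fraction <= 1.0:
        raise ValueError("germline_fraction must be between 0 and 1")
    return 0.5 * germline_fraction


def prob_at_least_one(n_offspring: int, germline_fraction: float = 1.0) -> float:
    """Probability that at least one of n offspring carries the transgene."""
    if n_offspring < 0:
        raise ValueError("n_offspring must be non-negative")
    p = transmission_rate(germline_fraction)
    return 1.0 - (1.0 - p) ** n_offspring


# Hypothetical mosaic founder with 20% transgenic germ cells:
# compare a small litter with the many offspring of a widely mated male.
few = prob_at_least_one(2, germline_fraction=0.2)
many = prob_at_least_one(20, germline_fraction=0.2)
```

Under these assumptions, increasing the number of offspring
screened raises the chance of detecting transmission, which is
the rationale for mating male founders to several females or
superovulating female founders.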
The milk from lactating G0 and G1 females is
examined for the expression of the heterologous protein
using immunological techniques such as ELISA (see Harlow
and Lane, Antibodies: A Laboratory Manual, Cold Spring
Harbor Laboratory, 1988) and Western blotting (Towbin et
al., Proc. Natl. Acad. Sci. USA 76: 4350-4354, 1979). For
a variety of reasons known in the art, expression levels
of the heterologous protein will be expected to differ
between individuals.

A satisfactory family of animals should satisfy
three criteria: they should be derived from the same
founder G0 animal; they should exhibit stable transmission
of the transgene; and they should exhibit acceptably
stable expression levels from generation to generation and
from lactation to lactation of individual animals. These
principles have been demonstrated and discussed (Carver et
al., Bio/Technology 11: 1263-1270, 1993). Animals from
such a suitable family are referred to as a "line."
Initially, male animals, G0 or G1, are used to derive a
flock or herd of producer animals by natural or artificial
insemination. In this way, many female animals containing
the same transgene integration event can be quickly
generated from which a supply of milk can be obtained.

The protein C is recovered from milk using
standard practices such as skimming, precipitation,
filtration and protein chromatography techniques.

Protein C produced according to the present
invention can be activated by removal of the activation
peptide from the amino terminus of the heavy chain.
Activation can be achieved using methods that are well
known in the art, for example, using α-thrombin (Marlar et
al., Blood 59:1067-1072, 1982), trypsin (Marlar et al.,
1982, ibid.), Russell's viper venom factor X activator
(Kisiel, J. Clin. Invest. 64:761-769, 1979) or
commercially available Protac C (American Diagnostica, NY,
NY).

The protein C molecules provided by the present
invention and pharmaceutical compositions thereof are
particularly useful for administration to humans to treat
a variety of conditions involving intravascular
coagulation. For instance, although deep vein thrombosis
and pulmonary embolism can be treated with conventional
anticoagulants, the activated protein C described herein
may be used to prevent the occurrence of thromboembolic
complications in identified high risk patients, such as
those undergoing surgery or those with congestive heart
failure. Since activated protein C is more selective than
heparin, being active in the body generally when and where
thrombin is generated and fibrin thrombi are formed,
activated protein C will be more effective and less likely
to cause bleeding complications than heparin when used
prophylactically for the prevention of deep vein
thrombosis. The dose of activated protein C for
prevention of deep vein thrombosis is in the range of
about 100 μg to 100 mg/day, and administration should
begin at least about 6 hours prior to surgery and continue
at least until the patient becomes ambulatory. In
established deep vein thrombosis and/or pulmonary
embolism, the dose of activated protein C ranges from
about 100 μg to 100 mg as a loading dose followed by
maintenance doses ranging from 3 to 300 mg/day. Because
of the lower likelihood of bleeding complications from
activated protein C infusions, activated protein C can
replace or lower the dose of heparin during or after
surgery in conjunction with thrombectomies or
embolectomies.

The activated protein C compositions of the
present invention will also have substantial utility in
the prevention of cardiogenic emboli and in the treatment
of thrombotic strokes. Because of its low potential for
causing bleeding complications and its selectivity,
activated protein C can be given to stroke victims and may
prevent the extension of the occluding arterial thrombus.
The amount of activated protein C administered will vary
with each patient depending on the nature and severity of
the stroke, but doses will generally be in the range of
those suggested below.

Pharmaceutical compositions of activated protein
C provided herein will be a useful treatment in acute
myocardial infarction because of the ability of activated
protein C to enhance in vitro fibrinolysis. Activated
protein C can be given with tissue plasminogen activator
or streptokinase during the acute phases of the myocardial
infarction. After the occluding coronary thrombus is
dissolved, activated protein C can be given for subsequent
days or weeks to prevent coronary reocclusion. In acute
myocardial infarction, the patient is given a loading dose
of at least about 1-500 mg of activated protein C,
followed by maintenance doses of 1-100 mg/day.

Activated protein C is useful in the treatment
of disseminated intravascular coagulation (DIC). Patients
with DIC characteristically have widespread
microcirculatory thrombi and often severe bleeding
problems which result from consumption of essential
clotting factors. Because of its selectivity, activated
protein C will not aggravate the bleeding problems
associated with DIC, as do conventional anticoagulants,
but will retard or inhibit the formation of additional
microvascular fibrin deposits.

The invention is further illustrated by the
following non-limiting examples.

EXAMPLES

Example I

A. Vector pMAD6 Construction

The multiple cloning site of the vector pUC18
(Yanisch-Perron et al., Gene 33:103-119, 1985) was removed
and replaced with a synthetic double stranded
oligonucleotide (the strands of which are shown in SEQ ID
NO: 6 and SEQ ID NO: 7) containing the restriction sites
Pvu I/Mlu I/Eco RV/Xba I/Pvu I/Mlu I, and flanked by 5'
overhangs compatible with the restriction sites Eco RI and
Hind III. pUC18 was cleaved with both Eco RI and Hind
III, the 5' terminal phosphate groups were removed with
calf intestinal phosphatase, and the oligonucleotide was
ligated into the vector backbone. The DNA sequence across
the junction was confirmed by sequencing, and the new
plasmid was called pUCPM.

The β-lactoglobulin (BLG) gene sequences from
pSS1tgXS (disclosed in WIPO publication WO 88/00239) were
excised as a Sal I-Xba I fragment and recloned into the
vector pUCPM that had been cut with Sal I and Xba I to
construct vector pUCXS. pUCXS is thus a pUC18 derivative
containing the entire BLG gene from the Sal I site to the
Xba I site of phage SS1 (Ali and Clark, J. Mol. Biol.
199:415-426, 1988).

The plasmid pSS1tgSE (disclosed in WIPO
publication WO 88/00239) contains a 1290 bp BLG fragment
flanked by Sph I and Eco RI restriction sites, a region
spanning a unique Not I site and a single Pvu II site
which lies in the 5' untranslated leader of the BLG mRNA.
Into this Pvu II site was ligated a double stranded, 8 bp
DNA linker (5'-GGATATCC-3') encoding the recognition site
for the enzyme Eco RV. This plasmid was called
pSS1tgSE/RV. DNA sequences bounded by Sph I and Not I
restriction sites in pSS1tgSE/RV were excised by enzymatic
digestion and used to replace the equivalent fragment in
pUCXS. The resulting plasmid was called pUCXSRV. The
sequence of the BLG insert in pUCXSRV is shown in SEQ ID
NO: 5, with the unique Eco RV site at nucleotide 4245 in
the 5' untranslated leader region of the BLG gene. This
site allows insertion of any additional DNA sequences
under the control of the BLG promoter 3' to the
transcription initiation site.

Using the primers BLGAMP3 (5'-TGG ATC CCC TGC
CGG TGC CTC TGG-3'; SEQ ID NO: 8) and BLGAMP4 (5'-AAC GCG
TCA TCC TCT GTG AGC CAG-3'; SEQ ID NO: 9) a PCR fragment
of approximately 650 bp was produced from sequences
immediately 3' to the stop codon of the BLG gene in
pUCXSRV. The PCR fragment was engineered to have a BamH I
site at its 5' end and an Mlu I site at its 3' end and was
cloned as such into BamH I and Mlu I cut pGEM7zf(+)
(Promega) to give pDAM200(+).
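
That the two primers carry the engineered cloning sites can be checked directly from their sequences. The following sketch is illustrative only. It uses the standard recognition sequences for BamH I (GGATCC) and Mlu I (ACGCGT) and takes the primer sequences as printed above; one base near the 3' end of BLGAMP4 is ambiguous in the source text and is read here as C, which is an assumption that does not affect the Mlu I site.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamH I": "GGATCC", "Mlu I": "ACGCGT"}

# Primer sequences as given in the text, spaces removed.
# Assumption: the ambiguous base in BLGAMP4 (third codon from the 3' end) is C.
PRIMERS = {
    "BLGAMP3": "TGGATCCCCTGCCGGTGCCTCTGG",
    "BLGAMP4": "AACGCGTCATCCTCTGTGAGCCAG",
}


def find_sites(seq: str, sites: dict = SITES) -> dict:
    """Return {enzyme: 0-based position} for each recognition site present."""
    seq = seq.upper()
    return {name: seq.find(motif) for name, motif in sites.items() if motif in seq}


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Each primer carries exactly one of the two sites, one base in from its 5' end, so the amplified fragment acquires a BamH I site at one terminus and an Mlu I site at the other.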

pUCXSRV was digested with Kpn I, and the
largest, vector-containing band was gel purified. This
band contained the entire pUC plasmid sequences and some
3' non-coding sequences from the BLG gene. Into this
backbone was ligated the small Kpn I fragment from
pDAM200(+) which, in the correct orientation, effectively
engineered a Bam HI site at the extreme 5' end of the 2.6
Kbp of the BLG 3' flanking region. This plasmid was
called pBLAC200. A 2.6 Kbp Cla I-Xba I fragment from
pBLAC200 was ligated into Cla I-Xba I cut pSP72 vector
(Promega), thus placing an Eco RV site immediately
upstream of the BLG sequences. This plasmid was called
pBLAC210.

The 2.6 Kbp Eco RV-Xba I fragment from pBLAC210
was ligated into Eco RV-Xba I cut pUCXSRV to form pMAD6
(SEQ ID NO: 23). This, in effect, excised all coding and
intron sequences from pUCXSRV, forming a BLG minigene
consisting of 4.2 Kbp of 5' promoter and 2.6 Kbp of 3'
downstream sequences flanking a unique Eco RV site. An
oligonucleotide linker (ZC6839: ACTACGTAGT; SEQ ID NO: 10)
was inserted into the Eco RV site of pMAD6 (SEQ ID NO:
23). This modification destroyed the Eco RV site and
created a Sna BI site to be used for cloning purposes.
The vector was designated pMAD6-Sna. Messenger RNA
initiates upstream of the Sna BI site and terminates
downstream of the Sna BI site. The precursor transcript
will encode a single BLG-derived intron, intron 6, which
is entirely within the 3' untranslated region of the gene.
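
The effect of the linker insertion on the restriction sites can be verified with a few lines of code. The sketch below is a simplified illustration, not the procedure used in the example. It models only the six bases of each recognition site, ignores the surrounding plasmid sequence, and assumes blunt cleavage at the centre of the Eco RV (GAT|ATC) and Pvu II (CAG|CTG) sites. It also applies the same check to the 8 bp Eco RV linker that was ligated into the Pvu II site of pSS1tgSE.

```python
ECO_RV = "GATATC"
SNA_BI = "TACGTA"
PVU_II = "CAGCTG"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def insert_blunt(site: str, linker: str) -> str:
    """Insert a linker at the centre of a blunt-cutting recognition site."""
    half = len(site) // 2
    return site[:half] + linker + site[half:]


if __name__ == "__main__":
    # Linker described for pMAD6-Sna: destroys Eco RV, creates Sna BI.
    sna_junction = insert_blunt(ECO_RV, "ACTACGTAGT")
    print(sna_junction, ECO_RV in sna_junction, SNA_BI in sna_junction)

    # Linker described for pSS1tgSE/RV: creates Eco RV within the Pvu II site.
    rv_junction = insert_blunt(PVU_II, "GGATATCC")
    print(rv_junction, ECO_RV in rv_junction)
```

Both linkers are self-complementary, so the result does not depend on the orientation in which the linker is ligated.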

B. Intronless Vector pMAD

The beta-lactoglobulin cloning vector pMAD was
also constructed to allow the insertion of cDNAs under the
control of the beta-lactoglobulin gene promoter in
constructs containing no introns. To generate pMAD, the
plasmid pBLAC100 was opened by digestion with both Eco RV
and Sal I. The vector fragment was gel purified and the
linearized vector was ligated with the 4.2 kb promoter
fragment from the plasmid pUCXSRV as a Sal I-Eco RV
fragment. The resulting construct was designated pST1 and
constitutes a beta-lactoglobulin mini-gene encompassing
4.2 kb of promoter region and 2.1 kb of 3' non-coding
region beginning immediately downstream of the
beta-lactoglobulin translational termination codon. A unique
Eco RV site allows blunt-end cloning of any additional DNA
sequences. To generate transgenic animals it is generally
accepted in the art and preferred to separate bacterial
plasmid vector sequences from those intended to be used in
the generation of transgenic animals. In order to allow
the practical excision of novel cDNA based constructs
using this beta-lactoglobulin mini-gene, the minigene was
excised from pST1 on a Xho I-Not I fragment, the DNA
termini made flush with Klenow polymerase and the product
was ligated into the Eco RV site of pUCPM to yield pMAD.
Digestion with Mlu I liberates beta-lactoglobulin-cDNA
constructs from the bacterial vector backbone.

Intronless constructs based on cDNAs and vectors
such as pMAD benefit from the use of "rescue technology"
for efficient expression. Rescue technology takes
advantage of the ability of a co-injected and
co-integrated BLG gene to improve the expression levels
obtained from intronless, cDNA-based constructs in the
transgenic system. Rescue technology is disclosed in WIPO
publication WO 92/11358, and is incorporated herein by
reference.

Example II

A. Isolation of cDNA

A cDNA sequence coding for human protein C was
prepared as described in U.S. Patent 4,959,318, which is
incorporated herein by reference. Briefly, a genomic
fragment containing an exon corresponding to amino acids
-42 to -19 (SEQ ID NO: 1) of the pre-pro peptide of protein
C was isolated, nick translated and used as a probe to
screen a cDNA library constructed by the technique of
Gubler and Hoffman, Gene 25:263-269, 1983, using mRNA from
HepG2 cells. This cell line was derived from human
hepatocytes and was previously shown to synthesize protein
C (Fair and Bahnak, Blood 64:194-204, 1984). Positive
clones comprising cDNA inserted into the Eco RI site of
phage λgt11 were isolated and screened with an
oligonucleotide probe corresponding to the 5' non-coding
region of the protein C gene. One clone was also positive
with this probe and its entire nucleotide sequence was
determined. The cDNA contained 70 bp of 5' untranslated
sequence, the entire coding sequence for human
prepro-protein C, and the entire 3' non-coding region
corresponding to the second polyadenylation site.

B. Subcloning of Protein C cDNA

The vector pDX was derived from pD3, which was
generated from plasmid pDHFRIII (Berkner et al., Nuc.
Acids Res. 13:841-857, 1985). The Pst I site immediately
upstream from the DHFR sequence in pDHFRIII was converted
to a Bcl I site by digestion with Pst I. The DNA was
phenol extracted, ethanol precipitated and resuspended in
buffer B (50 mM Tris pH 8, 7 mM MgCl2, 7 mM β-MSH). A
ligation reaction containing the linearized plasmid DNA
and Bcl I linkers was done. The resulting plasmid was
phenol extracted, ethanol precipitated and digested with
Bcl I and gel purified. The gel purified plasmid DNA was
circularized by ligation and used to transform E. coli
HB101. Positive colonies were identified by restriction
analysis and designated pDHFR'. DNA from positive
colonies was isolated and used to transform dam- E. coli.

Plasmid pD2' was generated by cleaving pDHFR'
and pSV40 (comprising Bam HI digested SV40 DNA cloned into
the Bam HI site of pML-1 (Lusky et al., Nature 293:79-81,
1981)) with Bcl I and Bam HI. The DNA fragments were
resolved by gel electrophoresis, and the 4.9 kb pDHFR'
fragment and 0.2 kb SV40 fragment were isolated. These
fragments were used in a ligation reaction, and the
resulting plasmid, designated pD2', was used to transform
E. coli RR1.

Plasmid pD2' was modified by deleting the
"poison" sequences in the pBR322 region (Lusky et al.,
1981, ibid.). Plasmids pD2' and pML-1 were digested with
Eco RI and Nru I. The 1.7 kb pD2' fragment and 1.8 kb
pML-1 fragment were isolated by gel purification,
circularized in a ligation reaction and used to transform
E. coli HB101. Positive colonies were identified using
restriction analysis (designated pD2) and digested with
Eco RI and Bcl I. A 2.8 kb fragment (fragment C) was
isolated and gel purified.

To generate the remaining fragments used in
constructing pD3, pDHFRIII was modified to convert the Sac
II (Sst II) site into either a Hind III or Kpn I site.
pDHFRIII was digested with Sst II and ligation reactions
with either Hind III or Kpn I linkers were done. The
resultant plasmids were digested with either Hind III or
Kpn I and gel purified. The resultant plasmids were
designated either pDHFRIII (Hind III) or pDHFRIII (Kpn I).
A 700 bp Kpn I-Bgl II fragment (fragment A) was purified
from pDHFRIII (Hind III).

The SV40 enhancer sequence was inserted into
pDHFRIII (Hind III) by first digesting SV40 DNA with Hind
III, and DNA from 5089 to 968 bp was isolated and
purified. Plasmid pDHFRIII (Hind III) was phosphatased,
and the SV40 DNA and linearized plasmid pDHFRIII (Hind
III) were used in a ligation reaction. A 700 bp Eco RI-Kpn
I fragment (fragment B) was isolated from the
resulting plasmid.

For the final construction of pD3, fragments A
(50 ng), B (50 ng) and C (10 ng) were combined in a
ligation reaction and used to transform E. coli RR1.
Positive colonies were isolated and plasmid DNA was
prepared.

Plasmid pD3 was modified to accept the insertion
of the protein C sequence by converting the Bcl I
insertion site to an Eco RI site. First, the Eco RI site
present in pD3 (the leftmost terminus in adenovirus 5 0-1)
was converted to a Bam HI site via conventional linkering
procedures. The resultant plasmid was transformed into
E. coli HB101. Plasmid DNA was prepared, and positive clones
were identified by restriction analysis.

pD3' is a vector identical to pD3 except that
the SV40 polyadenylation signal (i.e., the SV40 Bam HI
(2533 bp) to Bcl I (2770 bp) fragment) is in the late
orientation. Thus, pD3' contains a Bam HI site as the
site of gene insertion.

To generate pDX, the Eco RI site in pD3' was
converted to a Bcl I site by Eco RI cleavage, incubation
with S1 nuclease and subsequent ligation with Bcl I
linkers. DNA was prepared from a positively identified
colony, and a 1.9 kb Xho I-Pst I fragment containing the
altered restriction site was prepared via gel
purification. In a second modification, Bcl I-cleaved pD3
was ligated with Eco RI-Bcl I adapters in order to
generate an Eco RI site as the position for inserting a
gene into the expression vector. Positive colonies were
identified by restriction analysis. The resulting
plasmid, designated pDX, has a unique Eco RI site for
insertion of foreign genes.

The protein C cDNA was inserted into pDX as an
Eco RI fragment. Plasmids were screened by restriction
analysis. A plasmid having the protein C insert in the
correct orientation with respect to the promoter elements
was designated pDX/PC. Because the cDNA
insert in pDX/PC contains an ATG codon in the 5' non-coding
region, deletion mutagenesis was performed on the cDNA.
Deletion of the three base pairs was performed according
to standard procedures of oligonucleotide-directed
mutagenesis. The pDX-based vector containing the modified
cDNA was designated p594.

C. Modification of the Protein C Processing Site

To enhance the processing of single-chain
protein C to the two-chain form, two additional arginine
residues were introduced immediately upstream of the
Lys156-Arg157 cleavage site of the precursor protein,
resulting in a cleavage site consisting of four basic
amino acids, Arg-Arg-Lys-Arg (SEQ ID NO: 20). The
resultant mutant precursor of protein C was designated
PC962. It contains the sequence Ser-His-Leu-Arg-Arg-Lys-
Arg-Asp (SEQ ID NO: 22) at the cleavage site. Processing
at the Arg-Asp bond results in a two-chain protein C
molecule.

The mutant molecule was generated by altering
the cloned cDNA by site-specific mutagenesis (essentially
as described by Zoller and Smith, DNA 3:479-488, 1984, for
the two-primer method) using the mutagenic oligonucleotide
ZC962 (5' AGTCACCTGAGAAGAAAACGAGACA 3'; SEQ ID NO: 11).
Plasmid p594 was digested with Sst I, and the
approximately 87 bp fragment was cloned into M13mp11 and
single-stranded template DNA was isolated. Following
mutagenesis, a correct clone was identified by sequencing.
Replicative form DNA was isolated, digested with Sst I,
and the protein C fragment was inserted into Sst I-cut
p594. Clones having the Sst I fragment inserted in the
desired orientation were identified by restriction enzyme
mapping. The resulting expression vector was designated
pDX/PC962.
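
The relationship between the mutagenic oligonucleotide and the modified cleavage site can be checked by translating the oligonucleotide. The following is a minimal sketch using the standard genetic code; it assumes that ZC962 is written in the sense orientation and that the reading frame begins at its first base, which is how the sequence lines up with the peptide given for SEQ ID NO: 22.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna: str) -> str:
    """Translate complete codons from the first base; trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


ZC962 = "AGTCACCTGAGAAGAAAACGAGACA"

if __name__ == "__main__":
    peptide = translate(ZC962)
    print(peptide)
    print("-".join(THREE_LETTER[aa] for aa in peptide))
```

The translation reproduces the Ser-His-Leu-Arg-Arg-Lys-Arg-Asp sequence, with the two inserted arginine codons (AGA AGA) immediately preceding the native Lys-Arg pair.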

D. Intronless Protein C Construct

To facilitate the cloning of the protein C cDNA,
PC962, into pMAD, the cDNA contained in pDX/PC962 was
modified to incorporate Eco RV sites at the extremities of
the protein C cDNA insert. A 769 bp Sst II-Pst I fragment
encompassing the 3' end of PC962 was cloned between the
Sst II and Pst I sites of pBluescript II SK® (Stratagene,
La Jolla, CA). The fragment was excised with Sst II and
Eco RV and purified. The 5' portion of PC962 was modified
by PCR. The sense oligonucleotide primer for this
reaction covered the 5' ATG region of this cDNA and
provided an Eco RV site upstream of this in the product.
The antisense oligonucleotide primer covered the Sst II
site used to generate the Sst II-Eco RV fragment. The
resulting PCR product was digested with Eco RV and Sst II
and ligated with the Sst II-Eco RV 3' fragment and Eco RV
digested pMAD. The resulting plasmid, designated pCORP9,
effectively contained the PC962 cDNA flanked by Eco RV
sites in an intronless fusion driven by the
beta-lactoglobulin promoter.

E. Genomic Protein C DNA Construction

A genomic DNA construct containing exons I
through VIII was made. See U.S. Patent 4,959,318, which
is incorporated herein by reference, for disclosure of the
exon structure of the protein C gene. This genomic
construct, designated GPC10-1, changed the sequence 16
base pairs upstream of the ATG from the native protein C
sequence to the beta-lactoglobulin sequence and introduced
mutations in the propeptide cleavage site located in exon
2, and the two-chain cleavage site located in exon 6, as
described below.

The construct was assembled using four fragments
designated A, B, C and D and encompassed the protein C
gene sequence from the ATG to a Bam HI site in exon VIII,
immediately upstream of the stop codon. These fragments
were generated from a human genomic library in λ Charon 4A
phage that was screened with a radiolabeled cDNA probe for
human protein C. The screening of the λ library produced
three clones that together mapped the entire protein C
gene (Foster et al., 1985, ibid.). These clones were
designated PCλ1, PCλ6 and PCλ8.

Fragment A was a Not I to Eco RI fragment that
contained exons I and II of the genomic sequence and was
1698 bp. A subclone of PCλ6 contained an Eco RI to Eco RI
fragment and was designated pHCR4.4-1. Using pHCR4.4-1 as
a template and oligonucleotides ZC6303 (SEQ ID NO: 12) and
ZC6337 (SEQ ID NO: 13), a DNA fragment was generated by
polymerase chain reaction (PCR). Oligonucleotide ZC6303
changed the sequence 16 base pairs 5' to the ATG sequence
from the native protein C sequence to the equivalent
sequence from the beta-lactoglobulin gene and introduced a
Not I site. Oligonucleotide ZC6337 changed the propeptide
cleavage site from Arg-Ile-Arg-Lys-Arg (SEQ ID NO: 24) to
Gln-Arg-Arg-Lys-Arg (SEQ ID NO: 25). The resulting
PCR-generated fragment was digested with Not I and Bss HII,
and a 1402 base pair fragment was gel purified and
designated A1. A second fragment was prepared using a
λgt11 clone of PCλ1 as a template with oligonucleotides
ZC6306 (SEQ ID NO: 14) and ZC6338 (SEQ ID NO: 15) in a
polymerase chain reaction. The resulting DNA fragment,
designated A3, was digested with Bss HII and Eco RI and
gel purified, resulting in a 296 base pair fragment.
Fragments A1 and A3 were ligated into the Bluescript II
KS® phagemid vector (Stratagene, La Jolla, CA). The
resulting plasmid, designated GPC 2-2, was digested with
Not I and Eco RI, gel purified and the Not I-Eco RI DNA
fragment was designated Fragment A.

pCR 2-14 is a subclone that contains an Eco RI
to Eco RI DNA fragment of PCλ8 (Foster et al., 1985,
ibid.). The plasmid was digested with Eco RI and Sst I
and gel purified. The resulting fragment was designated
Fragment B.

Plasmid pCR 2-14 was used as template DNA with
oligonucleotides ZC6373 (SEQ ID NO: 16) and ZC6305 (SEQ ID
NO: 17), which introduced an Afl II site and the RRKR
mutation of the native (KR) two-chain cleavage site, in a
polymerase chain reaction. The resulting PCR-generated
fragment was digested with Bgl II and Afl II and gel
purified, resulting in a 1441 base pair fragment,
designated E1. Fragment E1 was used in a ligation
reaction with oligonucleotides ZC6302 (SEQ ID NO: 18) and
ZC6304 (SEQ ID NO: 19). These oligonucleotides form Afl
II and Sst II restriction sites when annealed and were
ligated to the 3' end of fragment E1, resulting in a
fragment with a 5' Bgl II site and a 3' Sst II site. This
fragment was used in a ligation reaction with a Bam HI-Sst
II digested Bluescript II KS® phagemid vector
(Stratagene). The resulting plasmid was designated GPC 8-5
and digested with Sst I and Sst II, generating a 626
base pair fragment, designated Fragment C.

A fourth fragment was generated by digestion of
a genomic subclone (pHCB7-1) of PCλ8. pHCB7-1 contained a
Bgl II to Bgl II fragment that encompassed exons VI
through VIII. pHCB7-1 was digested with Sst II and Bam HI
and a 2702 base pair fragment was gel purified. The
fragment was designated Fragment D.

A five-part ligation reaction was prepared using
Not I and Bam HI digested and linearized Bluescript II KS®
phagemid vector (Stratagene) with Fragment A (5' Not I to
3' Eco RI) that contained exons I and II, Fragment B (5'
Eco RI to 3' Sst I) that contained exons III, IV and V,
Fragment C (5' Sst I to 3' Sst II) that contained the 5'
portion of exon VI and Fragment D (5' Sst II to 3' Bam HI)
that contained the remaining 3' portion of exon VI and
exons VII and VIII. The resulting DNA was 8950 base pairs
and designated GPC10-1.

GPC10-1 was originally generated with BLG
sequences and a Not I site upstream of the ATG initiator
codon and modifications to both cleavage sites. A clone,
designated pPC12/BS, was generated to ensure that the 5'
Not I site of GPC10-1 would not introduce secondary
structure into mRNA molecules that could hinder
translation. pPC12/BS was generated using PCR
amplification of a 1 kb Not I-Sca I fragment that covered
the 5' region of the protein C gene and contained the
wild-type ATG codon environment. This introduced an Eco
RV site immediately downstream of the Not I site, adjacent
to the ATG codon, and a Bam HI site was incorporated 3' of
the Sca I site to facilitate cloning. Following a Not
I/Bam HI digestion, the PCR product was cloned into Not
I-Bam HI digested Bluescript II KS® phagemid vector
(Stratagene). The Not I-Eco RV-Sca I fragment present in
pPC12/BS was excised, purified and ligated to GPC10-1,
which had been linearized with Not I and partially
digested with Sca I (the pUC ampicillin gene has an
internal Sca I site). The resulting clone was designated
GPC10-2 and possesses an Eco RV site immediately upstream
of the ATG initiator codon.

GPC10-1 and GPC10-2 both terminated at the final
Bam HI site in exon VIII of the protein C gene. To
reconstitute the 56 bp of sequence, ending at the
termination codon, two oligonucleotides were synthesized
with flanking Bam HI (5') and Bgl II (3') restriction
sites. Following annealing of the oligonucleotides, the
product was cloned into Bam HI digested pBST+ to generate
plasmid pPC3'. pBST+ is a derivative of pBS (Stratagene)
with a new polylinker. The addition of the polylinker
added Bgl II, Xho I, Nar I and Cla I restriction sites
from the vector polylinker downstream of the destroyed Bgl
II site of the oligonucleotide construct.

The Not I-Bam HI fragment of GPC10-1 was
subcloned into Not I/Bam HI digested pPC3' to add 3'
coding sequences of protein C, the TAG termination codon
followed by Bgl II-Xho I-Nar I-Cla I. The 3' region of
the protein C gene beginning with the Eco RV site in
intron V was excised from this plasmid on an Eco RV-Cla I
fragment.

The Eco RV-Eco RV fragment from GPC10-2,
covering the 5' portion of the protein C gene, and the
above Eco RV-Cla I fragment covering the 3' portion of the
protein C gene were combined between the Eco RV and Cla I
sites of pMAD6 (SEQ ID NO: 23) to generate pCORP13. This
effectively placed a genomic portion of the protein C gene
with modified propeptide and two-chain cleavage sites
under the control of the beta-lactoglobulin promoter.

A further genomic construct was generated from
pCORP13 that contained only the modified two-chain
cleavage site. This was achieved using PCR amplification
to modify two fragments, which resulted in restoration of
the coding capability of exon 2 from the mutant Gln-Arg-
Arg-Lys-Arg (SEQ ID NO: 25) to the wild-type Arg-Ile-Arg-
Lys-Arg (SEQ ID NO: 24). pCORP13 was used as template for
these reactions. The first fragment was 1.3 kb, which
encompassed the 5' end of the protein C gene up to the Bam
HI site in exon 2. For this reason, the sense primer was
designed to add a Hind III site 5' to the Eco RV site
proximal to the ATG initiation codon. The antisense
primer was designed to restore the wild-type sequences in
exon 2, which included a restored Bam HI site. A second
fragment of 0.2 kb, from the Bam HI site in exon 2 to the
Xho I site in intron 2, was also amplified. The two
fragments were combined in pGEMIX (Promega, Madison, WI)
to generate pGEMPC1.5. A 7.5 kb Xho I fragment from
pCORP13 was ligated to Xho I digested pGEMPC1.5 to generate a
complete protein C genomic sequence covering exons 1-8
with a wild-type propeptide cleavage site and a modified
two-chain cleavage site. The plasmid was designated
pGEMPC14. The sequence was excised from pGEMPC14 as a
Hind III/Sal I fragment. The DNA termini were repaired
using a Klenow reaction and the fragment was blunt-end
ligated into Eco RV digested pMAD6 (SEQ ID NO: 23) to
generate pCORP14.

Mice for initial breeding stocks (C57BL6J,
CBACA) were obtained from Harlan Olac Ltd. (Bicester, UK).
These were mated in pairs to produce F1 hybrid cross
(B6CBAF1) for recipient females, superovulated females,
stud males and vasectomized males. All animals were kept
on a 14 hour light/10 hour dark cycle and fed water and
food (Special Diet Services RM3, Edinburgh, Scotland) ad
libitum.

Transgenic mice were generated essentially as
described in Hogan et al., Manipulating the Mouse Embryo:
A Laboratory Manual, Cold Spring Harbor Laboratory, 1986,
which is incorporated herein by reference in its entirety.
Female B6CBAF1 animals were superovulated at 4-5 weeks of
age by an i.p. injection of pregnant mares' serum
gonadotrophin (FOLLIGON, Vet-Drug, Falkirk, Scotland) (5
iu) followed by an i.p. injection of human chorionic
gonadotrophin (CHORULON, Vet-Drug, Falkirk, Scotland) (5
iu) 45 hours later. They were then mated with a stud male
overnight. Such females were next examined for copulation
plugs. Those that had mated were sacrificed, and their
eggs were collected for microinjection.

DNA was injected into the fertilized eggs as
described in Hogan et al. (ibid.). Briefly, the vector
containing the protein C expression unit was digested with
Mlu I, and the expression unit was isolated by sucrose
gradient centrifugation. All chemicals used were reagent
grade (Sigma Chemical Co., St. Louis, MO, U.S.A.), and all
solutions were sterile and nuclease-free. Solutions of
20% and 40% sucrose in 1 M NaCl, 20 mM Tris pH 8.0, 5 mM
EDTA were prepared using UHP water and filter sterilized.
A 30% sucrose solution was prepared by mixing equal
volumes of the 20% and 40% solutions. A gradient was
prepared by layering 0.5 ml steps of the 40%, 30% and 20%
sucrose solutions into a 2 ml polyallomer tube and allowed
to stand for one hour. 100 μl of DNA solution (max. 8 μg
DNA) was loaded onto the top of the gradient, and the
gradient was centrifuged for 17-20 hours at 26,000 rpm,
15°C in a Beckman TL100 ultracentrifuge using a TLS-55
rotor (Beckman Instruments, Fullerton, CA, USA).
Gradients were fractionated by puncturing the tube bottom
with a 20 ga. needle and collecting drops in a 96 well
microtiter plate. 3 μl aliquots were analyzed on a 1%
agarose mini-gel. Fractions containing the protein C DNA
fragment were pooled and ethanol precipitated overnight at
-20°C in 0.3 M sodium acetate. DNA pellets were resuspended
in 50-100 μl UHP water and quantitated by fluorimetry.
The protein C expression unit was diluted in Dulbecco's
phosphate buffered saline without calcium and magnesium
(containing, per liter, 0.2 g KCl, 0.2 g KH2PO4, 8.0 g
NaCl, 1.15 g Na2HPO4) or in TE (10 mM Tris-HCl, 1 mM EDTA
pH 7.5). DNA concentration is adjusted to about 6 μg/ml
prior to injection into the eggs (~2 pl total DNA solution
per egg).
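
The quantity of DNA delivered per egg follows from the stated concentration and injection volume. The sketch below works through that arithmetic, reading the figures above as about 6 μg/ml and about 2 pl. The conversion to copy number is an illustration only: it assumes an average mass of 660 g/mol per base pair of double-stranded DNA, and the 10,000 bp fragment length is a hypothetical value, since the size of the Mlu I expression unit is not stated here.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
BP_MOLAR_MASS = 660.0           # g/mol per base pair, approximate dsDNA average


def injected_mass_fg(conc_ug_per_ml: float, volume_pl: float) -> float:
    """Mass of DNA injected, in femtograms."""
    grams_per_ml = conc_ug_per_ml * 1e-6   # 1 ug = 1e-6 g
    millilitres = volume_pl * 1e-9         # 1 pl = 1e-9 ml
    return grams_per_ml * millilitres * 1e15


def injected_copies(conc_ug_per_ml: float, volume_pl: float, fragment_bp: int) -> float:
    """Approximate number of fragment copies injected."""
    grams = injected_mass_fg(conc_ug_per_ml, volume_pl) * 1e-15
    moles = grams / (fragment_bp * BP_MOLAR_MASS)
    return moles * AVOGADRO


if __name__ == "__main__":
    mass = injected_mass_fg(6.0, 2.0)
    # fragment_bp=10_000 is a hypothetical length chosen for illustration
    copies = injected_copies(6.0, 2.0, fragment_bp=10_000)
    print(f"mass per egg: {mass:.1f} fg")
    print(f"copies per egg (10 kb assumed): {copies:.0f}")
```

At the stated concentration and volume this corresponds to roughly 12 fg of DNA per egg; the copy number scales inversely with the actual fragment length.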

Recipient females of 6-8 weeks of age are
prepared by mating B6CBAF1 females in natural estrus with
vasectomized males. Females possessing copulation plugs
are then kept for transfer of microinjected eggs.

Following birth of potential transgenic animals,
tail biopsies are taken, under anesthesia, at four weeks
of age. Tissue samples are placed in 2 ml of tail buffer
(0.3 M Na acetate, 50 mM NaCl, 1.5 mM MgCl2, 10 mM
Tris-HCl, pH 8.5, 0.5% NP40, 0.5% Tween 20) containing 200
µg/ml proteinase K (Boehringer Mannheim, Mannheim,
Germany) and vortexed. The samples are shaken (250 rpm)
at 55°-60°C for 3 hours to overnight. DNA prepared from
biopsy samples is examined for the presence of the
injected constructs by PCR and Southern blotting. The
digested tissue is vigorously vortexed, and 5 µl aliquots
are placed in 0.5 ml microcentrifuge tubes. Positive and
negative tail samples are included as controls. Forty µl
of silicone oil (BDH, Poole, UK) is added to each tube,
and the tubes are briefly centrifuged. The tubes are
incubated in the heating block of a thermal cycler (e.g.
Omni-gene, Hybaid, Teddington, UK) at 95°C for 10 minutes.
Following this, each tube has a 45 µl aliquot of PCR mix
added such that the final composition of each reaction mix
is: 50 mM KCl; 2 mM MgCl2; 10 mM Tris-HCl (pH 8.3); 0.01%
gelatin; 0.1% NP40; 10% DMSO; 500 nM each primer; 200 µM
dNTPs; 0.02 U/µl Taq polymerase (Boehringer Mannheim,
Mannheim, Germany). The tubes are then cycled through 30
repeated temperature changes as required by the particular
primers used. The primers may be varied but in all cases
must target the BLG promoter region. This is specific for
the injected DNA fragments because the mouse does not have
a BLG gene. Twelve µl of 5x loading buffer containing
Orange G marker dye (0.25% Orange G (Sigma), 15% Ficoll
type 400 (Pharmacia Biosystems Ltd., Milton Keynes, UK))
is then added to each tube, and the reaction mixtures are
electrophoresed on a 1.6% agarose gel containing ethidium
bromide (Sigma) until the marker dye has migrated 2/3 of
the length of the gel. The gel is visualized with a UV
light source emitting a wavelength of 254 nm. Transgenic
mice having one or more of the injected DNA fragments are
identified by this approach.
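The final reaction composition given above is reached by adding 45 µl of PCR mix to a 5 µl tissue aliquot. The following sketch, offered only as an illustration, back-calculates the concentrations the 45 µl mix would need in order to reach those final values. It assumes that the 5 µl aliquot contributes none of the listed components (a simplification, since the tail buffer itself contains MgCl2, Tris-HCl and NP40) and that the silicone oil overlay does not mix with the aqueous phase.

```python
SAMPLE_UL = 5.0   # digested tissue aliquot
MIX_UL = 45.0     # PCR mix added to each tube

# Final concentrations as stated in the text (units in the key names)
FINAL = {
    "KCl (mM)": 50.0,
    "MgCl2 (mM)": 2.0,
    "Tris-HCl (mM)": 10.0,
    "gelatin (%)": 0.01,
    "NP40 (%)": 0.1,
    "DMSO (%)": 10.0,
    "each primer (nM)": 500.0,
    "dNTPs (uM)": 200.0,
    "Taq polymerase (U/ul)": 0.02,
}


def mix_concentrations(final, sample_ul=SAMPLE_UL, mix_ul=MIX_UL):
    """Concentration each component needs in the mix before dilution by the sample."""
    factor = (sample_ul + mix_ul) / mix_ul   # 50/45 for the volumes in the text
    return {name: value * factor for name, value in final.items()}


mix = mix_concentrations(FINAL)
for name, value in mix.items():
    print(f"{name}: {value:.4g}")

# Total enzyme per 50 ul reaction at the stated final concentration
taq_units_per_reaction = FINAL["Taq polymerase (U/ul)"] * (SAMPLE_UL + MIX_UL)
```

Under these assumptions every component of the mix is about 1.11 times its final concentration, and each reaction contains one unit of Taq polymerase.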

Positive tail samples are processed to obtain
pure DNA. The DNA samples are screened by Southern
blotting using a BLG promoter probe (nucleotides 2523-4253
of SEQ ID NO: 7).

Southern blot analysis of transgenic mice 
prepared essentially as described above demonstrated that 
approximately 10% of progeny contained protein C 
sequences. Examination of milk from positive animals by
reducing SDS polyacrylamide gel electrophoresis
demonstrated the presence of protein C at concentrations
up to 1 mg/ml.

Example 4

Donor ewes are treated with an intravaginal
progesterone-impregnated sponge (CHRONOGEST Goat Sponge,
Intervet, Cambridge, UK) on day 0. Sponges are left in
situ for ten or twelve days.

Superovulation is induced by treatment of donor
ewes with a total of one unit of ovine follicle
stimulating hormone (OFSH) (OVAGEN, Horizon Animal
Reproduction Technology Pty. Ltd., New Zealand)
administered in eight intramuscular injections of 0.125
units per injection starting at 5:00 pm on day -4 and
ending at 8:00 am on day 0. Donors are injected
intramuscularly with 0.5 ml of a luteolytic agent
(ESTRUMATE, Vet-Drug) on day -4 to cause regression of the
corpus luteum, to allow return to estrus and ovulation.
To synchronize ovulation, the donor animals are injected
intramuscularly with 2 ml of a synthetic releasing hormone
analog (RECEPTAL, Vet-Drug) at 5:00 pm on day 0.
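The OFSH regimen above specifies only the first injection (5:00 pm on day -4), the last injection (8:00 am on day 0), the number of injections (eight) and the total dose (one unit). The sketch below lays out one schedule consistent with those figures. It assumes twice-daily dosing at 8:00 am and 5:00 pm; that spacing is an inference from the stated start and end times, not something the text states.

```python
from datetime import time

MORNING = time(8, 0)
EVENING = time(17, 0)


def ofsh_schedule(first_day=-4, last_day=0, total_units=1.0):
    """Return (day, time, dose) tuples for an assumed twice-daily regimen.

    The regimen starts with the evening dose on first_day and ends with the
    morning dose on last_day, with the total dose split equally.
    """
    slots = [(day, t) for day in range(first_day, last_day + 1)
             for t in (MORNING, EVENING)]
    # Drop the morning of the first day and the evening of the last day
    slots = [s for s in slots
             if s != (first_day, MORNING) and s != (last_day, EVENING)]
    dose = total_units / len(slots)
    return [(day, t, dose) for day, t in slots]


schedule = ofsh_schedule()
for day, t, dose in schedule:
    print(f"day {day:+d} {t.strftime('%H:%M')}  {dose} units")
```

This assumed spacing yields exactly eight injections of 0.125 units each, matching the totals given in the text.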

Donors are starved of food and water for at
least 12 hours before artificial insemination (A.I.). The
animals are artificially inseminated by intrauterine
laparoscopy under sedation and local anesthesia on day 1.
Either xylazine (ROMPUN, Vet-Drug) at a dose rate of
0.05-0.1 ml per 10 kg bodyweight or ACP injection 10 mg/ml
(Vet-Drug) at a dose rate of 0.1 ml per 10 kg bodyweight
is injected intramuscularly approximately fifteen minutes
before A.I. to provide sedation. A.I. is carried out
using freshly collected semen from a Poll Dorset ram.
Semen is diluted with equal parts of filtered phosphate
buffered saline, and 0.2 ml of the diluted semen is
injected per uterine horn. Immediately pre- or post-A.I.,
donors are given an intramuscular injection of AMOXYPEN
(Vet-Drug).

Fertilized eggs are recovered on day 2 following
starvation of donors of food and water from 5:00 pm on day
1. Recovery is carried out under general anesthesia
induced by an intravenous injection of 5% thiopentone
sodium (INTRAVAL SODIUM, Vet-Drug) at a dose rate of 3 ml
per 10 kg bodyweight. Anesthesia is maintained by
inhalation of 1-2% halothane/O2/N2O. To recover the
fertilized eggs, a laparotomy incision is made, and the
uterus is exteriorized. The eggs are recovered by
retrograde flushing of the oviducts with Ovum Culture
Medium (Advanced Protein Products, Brierly Hill, West
Midlands, UK) supplemented with bovine serum albumin of
New Zealand origin. After flushing, the uterus is
returned to the abdomen, and the incision is closed.
Donors are allowed to recover post-operatively or are
euthanized. Donors that are allowed to recover are given
an intramuscular injection of Amoxypen L.A. at the
manufacturer's recommended dose rate immediately pre- or
post-operatively.

Plasmids containing the protein C DNA are
digested with Mlu I, and the expression unit fragments are
recovered and purified on sucrose density gradients. The
fragment concentrations are determined by fluorimetry, and
the fragments are diluted in Dulbecco's phosphate buffered
saline without calcium and magnesium or TE as described
above. The concentration is adjusted to 6 µg/ml, and
approximately 2 pl of the mixture is microinjected into
one pronucleus of each fertilized egg with visible
pronuclei.

All fertilized eggs surviving pronuclear
microinjection are cultured in vitro at 38.5°C in an
atmosphere of 5% CO2:5% O2:90% N2 and about 100% humidity
in a bicarbonate buffered synthetic oviduct medium (see
Table) supplemented with 20% v/v vasectomized ram serum.
The serum may be heat inactivated at 56°C for 30 minutes
and stored frozen at -20°C prior to use. The fertilized
eggs are cultured for a suitable period of time to allow
early embryo mortality (caused by the manipulation
techniques) to occur. These dead or arrested embryos are
discarded. Embryos having developed to 5 or 6 cell
divisions are transferred to synchronized recipient ewes.

Table

Synthetic Oviduct Medium

Stock A (Lasts 3 Months)
    NaCl                          6.29 g
    KCl                           0.534 g
    KH2PO4                        0.162 g
    MgSO4.7H2O                    0.182 g
    Penicillin                    0.06 g
    Sodium Lactate 60% syrup      0.6 mls
    Super H2O                     99.4 mls

Stock B (Lasts 5 weeks)
    NaHCO3                        0.21 g
    Phenol red                    0.001 g
    Super H2O                     10 mls

Stock C (Lasts [illegible] weeks)
    Sodium Pyruvate               0.051 g
    Super H2O                     10 mls

Stock D (Lasts 3 months)
    CaCl2.2H2O                    0.262 g
    Super H2O                     10 mls

Stock E (Lasts 3 months)
    Hepes                         0.651 g
    Phenol red                    0.001 g
    Super H2O                     10 mls

To make up 10 mls of Bicarbonate Buffered Medium
    STOCK A                       1 ml
    STOCK B                       1 ml
    STOCK C                       0.07 ml
    STOCK D                       0.1 ml
    Super H2O                     7.83 ml

    Osmolarity should be 265-285 mOsm.
    Add 2.5 ml of heat inactivated sheep serum
    and filter sterilize.

To make up 10 mls of HEPES Buffered Medium
    STOCK A                       1 ml
    STOCK B                       0.2 ml
    STOCK C                       0.07 ml
    STOCK D                       0.1 ml
    STOCK E                       0.8 ml
    Super H2O                     7.83 ml

    Osmolarity should be 265-285 mOsm.
    Add 2.5 ml of heat inactivated sheep serum
    and filter sterilize.
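As a simple consistency check on the two working-medium recipes in the Table, the sketch below sums the listed component volumes and computes the serum fraction after supplementation. It uses only the volumes given in the Table; the function names are illustrative.

```python
from decimal import Decimal

# Component volumes in ml, as listed in the Table
BICARBONATE_BUFFERED = {
    "Stock A": "1", "Stock B": "1", "Stock C": "0.07",
    "Stock D": "0.1", "Super H2O": "7.83",
}
HEPES_BUFFERED = {
    "Stock A": "1", "Stock B": "0.2", "Stock C": "0.07",
    "Stock D": "0.1", "Stock E": "0.8", "Super H2O": "7.83",
}


def total_ml(recipe):
    """Sum the component volumes exactly (Decimal avoids float rounding)."""
    return sum(Decimal(v) for v in recipe.values())


def serum_fraction(base_ml, serum_ml):
    """Volume fraction of serum in the supplemented medium."""
    return serum_ml / (base_ml + serum_ml)


print(total_ml(BICARBONATE_BUFFERED), total_ml(HEPES_BUFFERED))
print(serum_fraction(Decimal("10"), Decimal("2.5")))
```

Both recipes total 10 ml, and adding 2.5 ml of serum to 10 ml of medium gives a serum content of 20% v/v, in agreement with the supplementation level stated for the culture medium.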

Recipient ewes are treated with an intravaginal
progesterone-impregnated sponge (Chronogest Ewe Sponge or
Chronogest Ewe-Lamb Sponge, Intervet) left in situ for 10
or 12 days. The ewes are injected intramuscularly with
1.5 ml (300 iu) of a follicle stimulating hormone
substitute (P.M.S.G., Intervet) and with 0.5 ml of a
luteolytic agent (Estrumate, Coopers Pitman-Moore) at
sponge removal on day -1. The ewes are tested for estrus
with a vasectomized ram between 8:00 am and 5:00 pm on
days 0 and 1.

Embryos surviving in vitro culture are returned
to recipients (starved from 5:00 pm on day 5 or 6) on day
6 or 7. Embryo transfer is carried out under general
anesthesia as described above. The uterus is exteriorized
via a laparotomy incision with or without laparoscopy.
Embryos are returned to one or both uterine horns only in
ewes with at least one suitable corpus luteum. After
replacement of the uterus, the abdomen is closed, and the
recipients are allowed to recover. The animals are given
an intramuscular injection of Amoxypen L.A. at the
manufacturer's recommended dose rate immediately pre- or
post-operatively.

Lambs are identified by ear tags and left with 
their dams for rearing. Ewes and lambs are either housed 
and fed complete diet concentrates and other supplements
and/or ad lib. hay, or are let out to grass.

Within the first week of life (or as soon
thereafter as possible without prejudicing health), each
lamb is tested for the presence of the heterologous DNA by
two sampling procedures. Following tail biopsy, within a
week, a 10 ml blood sample is taken from the jugular vein
into an EDTA vacutainer. Tissue samples are taken by tail
biopsy as soon as possible after the tail has become
desensitized after the application of a rubber elastrator
ring to its proximal third (usually within 200 minutes
after "tailing"). The tissue is placed immediately in a
solution of tail buffer. Tail samples are kept at room
temperature and analyzed on the day of collection. All
lambs are given an intramuscular injection of Amoxypen
L.A. at the manufacturer's recommended dose rate
immediately post-biopsy, and the cut end of the tail is
sprayed with an antibiotic spray.

DNA is extracted from sheep blood by first
separating white blood cells. A 10 ml sample of blood is
diluted in 20 ml of Hank's buffered saline (HBS; obtained
from Sigma Chemical Co.). Ten ml of the diluted blood is
layered over 5 ml of Histopaque (Sigma) in each of two 15
ml screw-capped tubes. The tubes are centrifuged at 3000
rpm (2000 x g max.), low brake for 15 minutes at room
temperature. White cell interfaces are removed to a clean
15 ml tube and diluted to 15 ml in HBS. The diluted cells
are spun at 3000 rpm for 10 minutes at room temperature,
and the cell pellet is recovered and resuspended in 2-5 ml
of tail buffer.

To extract DNA from the white cells, 10% SDS is
added to the resuspended cells to a final concentration of
1%, and the tube is inverted to mix the solution. One mg
of fresh proteinase K solution is added, and the mixture
is incubated overnight at 45°C. DNA is extracted using an
equal volume of phenol/chloroform (x3) and
chloroform/isoamyl alcohol (x1). The DNA is then
precipitated by adding 0.1 volume of 3 M NaOAc and 2
volumes of ethanol, and the tube is inverted to mix. The
precipitated DNA is spooled out using a clean glass rod
with a sealed end. The spool is washed in 70% ethanol,
and the DNA is allowed to partially dry, then is
redissolved in TE (10 mM Tris-HCl, 1 mM EDTA, pH 7.5).

DNA samples from blood and tail are analyzed by
Southern blotting using probes for the BLG promoter region
and the protein C coding regions.
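The reagent additions in the white-cell DNA extraction described above are given as target concentrations and relative volumes. The following sketch, offered as an illustration only, converts them to volumes for a given sample size. It assumes that "2 volumes of ethanol" is reckoned against the aqueous volume before the salt is added, which the text does not specify.

```python
def sds_stock_volume_ml(sample_ml, stock_pct=10.0, final_pct=1.0):
    """Volume of SDS stock to add so that the mixture reaches final_pct.

    Solves stock_pct * v / (sample_ml + v) = final_pct for v.
    """
    return sample_ml * final_pct / (stock_pct - final_pct)


def precipitation_volumes_ml(aqueous_ml):
    """Return (3 M NaOAc volume, ethanol volume) for ethanol precipitation."""
    salt_ml = 0.1 * aqueous_ml      # 0.1 volume of 3 M NaOAc
    ethanol_ml = 2.0 * aqueous_ml   # 2 volumes of ethanol (assumed basis: aqueous phase)
    return salt_ml, ethanol_ml


# Example: cells resuspended in 4.5 ml of tail buffer (within the 2-5 ml range)
sds_ml = sds_stock_volume_ml(4.5)
salt_ml, ethanol_ml = precipitation_volumes_ml(5.0)
print(sds_ml, salt_ml, ethanol_ml)
```

For a 4.5 ml cell suspension, 0.5 ml of 10% SDS gives a final concentration of 1%; a 5 ml aqueous phase then calls for 0.5 ml of 3 M NaOAc and 10 ml of ethanol under the stated assumption.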

Example 5

A founder female animal, designated 30851, which
is transgenic for both BLG and pCORPS, was generated. She
has given rise to two sons and a transgenic daughter,
designated 40387. Recombinant transgenic protein C was
purified from milk (from 30851) by a single chromatography
step using a calcium-dependent monoclonal antibody
affinity column. Briefly, the milk samples were pooled up
to a volume of 40 ml. Two volumes of ice-cold 1 x TBS (50
mM Tris-HCl, 150 mM NaCl, pH 6.5) and 200 mM EDTA, pH 6.5
were added to solubilise the caseins. The EDTA-treated
milk solution was centrifuged at 15,000 rpm for 30 minutes
at 4°C in a JA20 rotor (Beckman Instruments, Irvine, CA).
After centrifugation, the upper lipid phase and the small
pellet were discarded.

The EDTA-treated milk was diluted with an equal
volume of ice-cold 1 x TBS and 133 mM CaCl2 while
stirring. A cloudy precipitate formed upon addition of
the CaCl2. The pH was quickly adjusted by addition of a
few drops of 4 M NaOH, and the precipitate was
redissolved. Any remaining insoluble material was removed
by filtration through a 0.45 µm filter.

The optical density of the solubilised milk was
measured at 280 nm, and the protein concentration was
calculated. The milk was diluted to a protein
concentration of 10 mg/ml using 1 x TBS containing CaCl2
to give a final Ca++ concentration of 25 mM. The milk was
used to resuspend antibody-Sepharose that carried the
immobilized Ca++-dependent monoclonal antibody PCL-2, and
had been washed in 1 x TBS and 25 mM CaCl2. PCL-2 is a
monoclonal antibody that binds single chain and two chain
protein C, whether or not they are gamma-carboxylated.
The milk-Sepharose mixture was incubated overnight at 4°C.

The matrix was washed twice in batch with 1 x
TBS and 25 mM CaCl2 and packed into a glass column. The
resin was washed at a flow rate of 1 ml/min with a calcium
containing buffer and a stable baseline was achieved
before the bound protein was eluted with an isocratic
elution using 1 x TBS and 25 mM EDTA, pH 6.5. Fractions
containing protein C were pooled and concentrated to
approximately 1 ml using an Amicon ultrafiltration unit
with a 10 kDa cut-off membrane (Amicon, Danvers, MA).

The monoclonal antibody, PCL-2, was coupled to
the activated Sepharose 4B as follows: 1 g (3.5 ml of gel)
of cyanogen bromide activated Sepharose 4B (Pharmacia LKB
Biotechnology, Piscataway, NJ) was swollen for 15 minutes
in 1 mM HCl. The swollen gel was resuspended in 0.1 M
NaHCO3, 0.5 M NaCl pH 8.3 and washed several times. The
washed gel was resuspended in 11 ml of monoclonal antibody
solution (PCL-2, 3.5 mg/ml in bicarbonate buffer pH 8.3)
with a coupling ratio of approximately 10 mg/ml gel.
Coupling was allowed to proceed for 2 h at room
temperature on a rotary mixer, and the gel was recovered
by gentle centrifugation. The monoclonal supernatant was
removed and replaced by 1 M ethanolamine in order to block
any remaining sites on the Sepharose. Blocking was
performed overnight at 4°C. Excess adsorbed protein was
removed by sequential acid and alkali washes (0.1 M
acetate, 0.5 M NaCl pH 4.0; 0.1 M NaHCO3, 0.5 M NaCl pH
8.3), and the coupled gel was stored in 50 mM Tris-HCl,
150 mM NaCl pH 6.5, 0.02% azide.
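The coupling ratio quoted above can be checked from the stated antibody volume, antibody concentration and swollen gel volume. The sketch below performs that arithmetic; it gives the antibody offered per ml of gel and assumes nothing about the fraction actually bound.

```python
def antibody_offered_mg_per_ml_gel(antibody_ml, antibody_mg_per_ml, gel_ml):
    """Antibody offered to the gel, in mg per ml of swollen gel."""
    return antibody_ml * antibody_mg_per_ml / gel_ml


# Values from the text: 11 ml of PCL-2 at 3.5 mg/ml, 3.5 ml of swollen gel
ratio = antibody_offered_mg_per_ml_gel(11.0, 3.5, 3.5)
print(f"{ratio:.1f} mg antibody per ml gel")
```

The result, 11 mg of antibody per ml of gel, agrees with the stated ratio of approximately 10 mg/ml gel.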

Example 6

Samples of purified recombinant transgenic
protein C were compared with plasma-derived protein C and
a plasma-derived activated protein C (APC) preparation.
Samples were run on SDS PAGE 4-20% acrylamide gradient
gels under reducing conditions and silver stained for
protein.

The plasma-derived material shows the presence
of a heavy-chain doublet around 44 kDa (Figure 1, Lane 1).
This has been reported to be due to partial occupancy of
the three possible N-linked glycosylation sites on the
molecule. A similar doublet, although of a slightly lower
mass presumably due to some subtle change in glycosylation
profile, has also been seen with the transgenic protein C.
The light chain was visible around 22 kDa for both
preparations. Significantly, in the case of the
plasma-derived material, uncleaved single-chain was clearly
visible above the heavy chain doublet. Plasma-derived
protein normally contained 5-10 percent of this inactive
material. In contrast, the transgenic protein C contains
no obvious single chain by this gel analysis. Therefore,
it contains less than a few percent at most of inactive
material. This most likely reflects the increased
efficiency of cleavage of the modified inter-chain site.
In further support of this observation, no single chain was
visible by direct western blot analysis of transgenic
sheep milk (40367, expression level 300 µg/ml).

The purified transgenic protein C was further
characterized as follows:

A. ELISA

An enzyme-linked immunosorbent assay (ELISA) for
protein C was done as follows: Affinity-purified
polyclonal antibody to human protein C (100 µl of 1 µg/ml
in 0.1 M Na2CO3, pH 9.6) was added to each well of a
96-well microtiter plate, and the plates were incubated
overnight at 4°C. The wells were then washed three times
with phosphate buffered saline (PBS) containing 0.05%
Tween-20 and incubated with 100 µl of 1% bovine serum
albumin (BSA), 0.05% Tween-20 in PBS at 4°C overnight. The
plates were then rinsed several times with PBS, air dried
and stored at 4°C. To assay samples, 100 µl of each sample
was incubated for 1 h at 37°C with a biotin-conjugated
sheep polyclonal antibody to protein C (30 ng/ml) in PBS
containing 1% BSA and 0.05% Tween-20. After incubation,
the wells were rinsed with PBS, and alkaline phosphatase
activity was measured by the addition of 100 µl of
phosphatase substrate (Sigma, St. Louis, MO) in 10%
diethanolamine, pH 9.8, containing 0.3 mM MgCl2. The
absorbance at 405 nm was read on a microtiter plate
reader. Quantitation was by comparison with a standard
curve using plasma-derived protein C quantitated by amino
acid analysis.
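Quantitation against a standard curve, as described above, can be sketched as interpolation between standards of known concentration. The example below is a minimal illustration and not the procedure actually used: the standard values are hypothetical placeholders, and piecewise-linear interpolation is a simplification (a fitted non-linear curve is often preferred for ELISA data).

```python
def interpolate_concentration(absorbance, standards):
    """Read a concentration off a standard curve by linear interpolation.

    standards: (concentration, absorbance) pairs whose absorbances are
    strictly increasing once sorted.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("absorbance lies outside the range of the standard curve")


# Hypothetical standards: (ng/ml protein C, absorbance at 405 nm)
HYPOTHETICAL_STANDARDS = [(0.0, 0.05), (10.0, 0.25), (20.0, 0.45), (40.0, 0.85)]

estimate = interpolate_concentration(0.35, HYPOTHETICAL_STANDARDS)
print(f"estimated concentration: {estimate:.1f} ng/ml")
```

Samples whose absorbance falls outside the range of the standards raise an error rather than being extrapolated; in practice such samples would be diluted and re-assayed.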

B. Amino-Terminal Sequencing

Amino-terminal sequencing of the transgenic
material was performed to ascertain the extent of
prosequence removal and to evaluate the presence of
gamma-carboxylation. There were three possible N-terminal
sequences of protein C. These were: 1) the prosequence, which
directs gamma-carboxylation and could have remained on the
light chain if the first cleavage site was incompletely
processed, 2) the light chain and 3) the heavy chain.
N-terminal sequencing of protein C obtained from transgenic
milk should have contained only the latter two sequences
if correct processing had occurred at both of the cleavage
sites. Amino-terminal sequencing would have also been
expected to reveal the presence of gamma-carboxylation in
the light chain. There are nine sites of carboxylation in
the first twenty-nine amino acids of the light chain. On
an analysis of released amino acids, the PTH-gamma
carboxylic acid derivatives eluted from the HPLC column in
the break-through and could therefore not be analyzed. Thus,
a gamma carboxylic acid showed up on the amino-terminal
sequence as a space rather than a glutamic acid.

The yields of amino acids in pmol released from
the sequencing of approximately 27 pmol (1.4 µg) of
purified transgenic protein C corresponded well to those
expected for an equimolar mixture of light and heavy
chains, and no obvious sequence was discernible for the
prosequence. Moreover, no other aberrant sequences were
detected, thus indicating a lack of inappropriate
proteolytic cleavages.

As stated previously, gamma-carboxylated
glutamate residues were expected to sequence as blanks
using standard instrument conditions. However, sequencing
protein C gives a double sequence which must be
deconvoluted using knowledge of the expected light and
heavy chain sequences. Normally, if the light chain alone
were sequenced, the gla residues at positions six and seven
would appear as blanks. However, when sequenced as intact
protein C, the heavy chain sequence contains a glutamate
residue at position six. Therefore, the only indirect
confirmation of the presence of a gla residue in the light
chain was the absence of glutamate at position seven, which
was not 'over written' by a glutamate in the heavy chain
(Figure 2). Two other indirect confirmations of the
presence of gamma carboxylation of the transgenic product
are described below.

C. Mass Analysis of the Purified Light Chain

The protein sequence of the transgenic-derived
protein C precursor had been modified with an
Arg-Arg-Lys-Arg (SEQ ID NO: 20) cleavage site between the
light and heavy chains to promote more efficient cleavage
of the single chain to 2-chain form. Western blot analysis of
the transgenic protein C milk and examination of the
purified protein C on reducing gels had already confirmed
that efficient cleavage had occurred. Normally during
secretion, but after cleavage of the plasma-derived
material, the two basic amino acids at the
carboxy-terminus of the light chain are trimmed back by a
basic carboxypeptidase. Establishing whether the carboxy-terminus
of the transgenic protein C light chain had been processed
to remove the two extra basic amino acids introduced by
this modification, as well as the two natural ones, was
achieved by measuring the mass of the purified light chain
in a quadrupole instrument using on-line liquid
chromatography and electrospray ionization. In order to
achieve this, all of the cysteine residues of protein C
were reduced and alkylated, and then the two chains were
separated by reversed-phase chromatography.

C1. Reductive Alkylation

Because protein C is heavily cross-linked for a
molecule of approximately 52 kDa, with twelve disulfide
bridges (17 of the 24 cysteines involved are in the light
chain), it was necessary to reductively alkylate the
entire protein before attempting to separate the chains by
reversed-phase chromatography. In view of the large
number of cysteines in the light chain, alkylation was
done with iodoacetamide, in place of the more commonly
used vinyl pyridine, to prevent the molecule from becoming
excessively hydrophobic.

The transgenic protein C material (6 nmol of
protein or 144 pmol of thiol) was reductively alkylated as
follows: 0.5 mg of protein C (by ELISA) in 0.5 ml of TBS
was added to 50 µl of 1 M Tris pH 8.0, 450 µl water, 570
mg guanidinium chloride, and 10 µl of 50 mg/ml DTT (0.3
µmol, representing a 20 fold excess of added thiol over
cysteine thiol). The mixture was incubated for 2 hours at
37°C. After incubation, 20 µl of 120 mg/ml iodoacetamide
(0.6 M, representing a 2 fold excess over DTT on a molar
basis) was added, and the mixture was incubated in the
dark for one hour at 4°C. The reaction was quenched by
adding 50 µl of 50 mg/ml DTT, representing a 2.5 fold
excess over iodoacetamide. The sample (final volume 1.5
ml) was stored at -20°C until analysis.

D. Purification of the Light Chain

Purification of protein C light chain was
achieved using a large pore polystyrene column with
divinylbenzene interactive groups (PLRP-S, 4000 Å, 8 µm,
2.1 mm ID; Polymer Laboratories, Shropshire, UK). The
optimum conditions for separation of the heavy and light
chains were determined to be: solvent A (0.1% TFA) and
solvent B (100% acetonitrile) at a flow of 0.5 ml/min with
a detector wavelength of 215 nm and a gradient of 30 to
60% solvent B over 60 min.
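The separation conditions above describe a linear gradient from 30% to 60% solvent B over 60 minutes at 0.5 ml/min. As an illustration only, the sketch below gives the solvent composition at any time during that gradient. It assumes a strictly linear ramp with no hold or re-equilibration segments, which the text does not describe.

```python
def percent_b(t_min, start_pct=30.0, end_pct=60.0, duration_min=60.0):
    """Percentage of solvent B at time t_min for a linear gradient.

    Times before the start or after the end are clamped to the end points.
    """
    t = min(max(t_min, 0.0), duration_min)
    return start_pct + (end_pct - start_pct) * t / duration_min


def gradient_volume_ml(flow_ml_per_min=0.5, duration_min=60.0):
    """Total solvent delivered over the gradient."""
    return flow_ml_per_min * duration_min


for t in (0, 15, 30, 45, 60):
    print(f"{t:>2} min: {percent_b(t):.1f}% B")
```

Under this assumption the composition rises by 0.5% B per minute, and the gradient consumes 30 ml of mobile phase.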
Fractions were collected across the eluted
peaks, and samples (10 µl) were analyzed by SDS PAGE on
4-20% gradient acrylamide gels under non-reducing
conditions. The light chain (fractions 3 to 5) was
completely resolved from both the heavy chain (fractions 7
to 9) and a single fraction (6) which contained a mixture
of heavy chain and what appeared to be unglycosylated
light chain.

A sample containing fully resolved light chain
was prepared for deglycosylation by centrifugal
evaporation under reduced pressure at room temperature.
Deglycosylation was carried out using peptide N-glycanase
(PNGase; Oxford Glycosystems, Oxford, UK). The protein
sample was redissolved in 50 µl of buffer and incubated
overnight with 1 unit (5 µl) PNGase, according to
manufacturer's specifications.

The light chain was purified from reduced and
alkylated plasma-derived protein C by the same method and
deglycosylated for further analysis.

E. Analysis by Mass Spectroscopy

Samples of purified light chain were subjected
to mass analysis using a liquid chromatography
electrospray interface to a Sciex Quadrupole Mass Analyser
(Sciex/Perkin Elmer, Toronto, CA). The LC system used a
0.5 mm ID column packed with PLRP-S 4000 Å, 8 µm resin
(Polymer Laboratories). The solvent system contained
buffer A (0.1% formic acid), buffer B (0.1% formic acid
and a 5:2 (v/v) mixture of ethanol to propan-1-ol). The
gradient used was from 5-60% buffer B over 35 minutes at a
flow rate of 25 µl per minute. The outflow of the column
was linked via a UV detector to the mass spectrometer,
which was run in positive-ion mode.

The purified and deglycosylated transgenic light
chain was analyzed and gave a relatively weak spectrum
which was reconstructed to give two components with masses
of 18,911.0 and 18,971.0. The plasma light chain was also
analyzed and gave a stronger signal with a single major
component. The spectrum of the plasma light chain was
reconstructed to give a single mass of 18,970.0.

The predicted mass for the light chain carrying
nine gamma-carboxy glutamic acids, one β-hydroxy aspartic
acid and seventeen carbamidomethyl cysteine residues and
ending with Leu155 was 18,966.9723, which is very close to
the masses detected for the transgenic (18,971.0) and
plasma-derived (18,970.0) light chains. The small
differences in mass were well within the accuracy
limitations for this instrument, particularly with the LC
delivery. This shows that the mass of the reductively
alkylated and deglycosylated transgenic light chain is
essentially identical to that for the plasma-derived
protein C. This implies that both molecules have
undergone the same post-translational modifications and
that the transgenic material is fully gamma-carboxylated,
has had all four basic amino acids trimmed back from the
carboxy-terminus of the light chain and has a single
β-hydroxy aspartic acid.
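The agreement between predicted and measured light-chain masses can be expressed as an absolute difference and as a relative error. The sketch below does this for the three reconstructed masses reported above, using only figures from the text; the labels are illustrative.

```python
PREDICTED_MASS = 18966.9723  # predicted light-chain mass from the text

OBSERVED_MASSES = {
    "transgenic, component at 18,971.0": 18971.0,
    "transgenic, component at 18,911.0": 18911.0,
    "plasma-derived": 18970.0,
}


def mass_error(observed, predicted=PREDICTED_MASS):
    """Return (difference in mass units, relative error in ppm)."""
    diff = observed - predicted
    return diff, diff / predicted * 1e6


for label, observed in OBSERVED_MASSES.items():
    diff, ppm = mass_error(observed)
    print(f"{label}: {diff:+.4f} ({ppm:+.0f} ppm)")
```

The 18,971.0 and 18,970.0 components differ from the predicted value by about 4.0 and 3.0 mass units respectively (roughly 0.02%), whereas the second transgenic component lies about 56 mass units below it.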

F. Activity Measurements

The activity of the transgenic protein C was
compared with that of the plasma-derived material in a
coagulation assay. First, each sample of protein C,
quantitated by amino acid composition analysis, was
activated by incubation with Protac, a snake venom
(American Diagnostica Inc, Greenwich, CT), at a venom to
protein ratio of 1 Unit Protac: 10 µg protein C for 60
minutes at 37°C. Aliquots of the activated material were
then compared for their ability to prolong the clotting
time of protein C depleted human plasma (Diagnostic
Reagents Ltd) in the presence of activated partial
thromboplastin time reagent - cephalin from rabbit brain
(Sigma) and calcium using a mechanical coagulometer
(Diagnostica Stago, Asnières, FR). A comparison of
clotting times with various additions of transgenic and
plasma-derived protein C (Figure 3) shows that the two
preparations had the same anti-coagulant activity per mg
of protein.

In summary, results show that the sheep-derived
transgenic protein C is correctly post-translationally
processed, with respect to gamma-carboxylation and
probably beta-hydroxylation, and has anticoagulant
activity fully equivalent to a high quality purified
plasma standard. The results demonstrate that the
C-terminal processing of the light chain, with the modified
RRKR cleavage site rather than the naturally occurring KR
site, has the two extra basic amino acids removed along
with the natural ones.

From the foregoing, it will be appreciated that,
although specific embodiments of the invention have been
described herein for purposes of illustration, various
modifications may be made without deviating from the
spirit and scope of the invention. Accordingly, the
invention is not limited except as by the appended claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANTS: ZymoGenetics, Inc.
                     1201 Eastlake Avenue East
                     Seattle
                     WA
                     USA
                     98102

                     PPL Therapeutics
                     Roslin
                     Edinburgh
                     Scotland
                     UK
                     EH25 9PP

    (ii) TITLE OF INVENTION: PROTEIN C PRODUCTION IN TRANSGENIC
                             ANIMALS

   (iii) NUMBER OF SEQUENCES: 25

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: ZymoGenetics, Inc.
          (B) STREET: 1201 Eastlake Avenue East
          (C) CITY: Seattle
          (D) STATE: WA
          (E) COUNTRY: USA
          (F) ZIP: 98102

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER:
          (B) FILING DATE:
          (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Sawislak, Deborah A
          (B) REGISTRATION NUMBER: 37,438
          (C) REFERENCE/DOCKET NUMBER: 95-28PC

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: 206-442-6672
          (B) TELEFAX: 206-442-6678

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 11725 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: join(3520..3630, 5093..5117, 5210..5347,
               5450..5584, 8253..8395, 9269..9386, 10516..11102)

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



AfTTGAATaG GGCGAGTAAC ACAAAACHG AGTGTCCTTA CCT6AAAAAT AGAGGTTAGA 60

QGGATGCTAT GTGCCAHGT GTGT6TGTGT TGGGGGTGGG GAHGCGGGT GATTTGTGAG 120

CAATTGGAGG TGAffiGTGGA GCCCAGTGCC CAGCACCTAT GCACTGGGGA CCCAAAAAG6 180

AGCATCTTCT CATGATTTTA TGTATCAGAA AHGGGATGG CATOTCAHG GGACAGCGTC 240

llllilCIIG TATGGTGGCA CATAAATACA TGTGTCTTAT AATTAATQGT ATTTTAGAn 300

TGACGAAATA TG6AATATTA CCTG7TGTGC T6ATCTTGGG CAAACTATAA TATCTCTGQG 360

CAAAAATGTC CCCATCTGAA AAACAGGGAC AACGTTCCTC CCTCAGCCAG CCACTATGGG 420

GaAAAATGA 6ACCACATCT GTCAAGGGTT HGCCCTCAC CTCCCTCCCT GCTGGATGGC 480

ATCCnGSTA GGCA6AGGTG GGCTTCGQQC AGAACAAGCC GTGaGftGCT AGGACCAGGA 540

GT6CTAGTGC CACTGTITGT CTATGGAGAG GGAGGCCTCA GTCaCAGGG CCAfiGCAAPJ 600

AITTGTGGTT ATGGATTAAC TCGAACTCCA GGCTGTCATG GCGGCAGGAC GGGGA/vCHG 660 

CAGTATCTCC ACGACCC6CC CCTGT6AGTC CCCaCCAGG CAGGTaATG AfiGGGlGTGG 720 

AGGGAGGGCT GCCCCCGGGA GAAGAGAGCT AG6TGGT6AT GAGGGQGAA TCQCCAGCC 780 

AGQGTGCTCA ACAAGCCT6A GCnOGGGTA AAAGGACACA AGGCCCTCCA CAGGCCAGGC 840 

CTGGCAGCCA CAGTCTCAGG TCCCTTTGCC ATGCGCCTCC CTCTTTCCAG GCCAACGGTC 900 

CCCAGGCCCA GGGCCATTCC AACAGACAGT nGGAGCCCA GGACCCTCCA HCTCCCCAC 960 

CCCACTTCCA CCTTTQGGGG TGTCGGATn GAACAAATCT CAGAAGCGGC CTCAGPGQ6A 1020 

GTCGGCAAGA A1GGAGAGCA GGGTCCGGTA GGGTGTGCAG AGGCCACGTG 6CCTATCCAC 1080 

TGGQGAGGGT TCCTTGATCT CTGGCCACCA GQGCTATCTC TGTGGCCnT TGGAfiCAACC 1140 

TGGT6GTTTG GGGCA6GGGT TGAATTTCCA GGCCTAAAAC CACACAGGCC TGGCaTGAG 1200 

TCCTGGCTCT 6CGAGTAATG CATGGATGTA AACATGGAGA CCCAG6ACCT TGCaCAGTC 1260 

TTCCGAGTCT 6GTGCCTGCA GTGTACTGAT GGTGT6AGAC CCTACTCCT6 GAQ6AT3GGG 1320 

GACAGAATCr 6ATC6ATCCC CTGGGnGGT GACTTCCCTG TGCAATCAAC GGAGACCAGC 1380 

AA6GGTTG6A TTTTTAATAA ACCACHAAC TCCTCCGAGT CTCAGTTTCC CCCTCTM6A 1440 

AATQQBGTTG ACAGCATTAA TAACTACCTC TTGGGTGGn GTGAGCCTTA ACTGAA5TCA 1500 

TAATATCTCA TGTTTACTCA GCAT6AGCTA TGTGCAAAGC CTGrmTGAG AGCTTT'^TGT 1560 

GGAaAACTC CTTTAATTCT CACAACACCC TTTAAGGCAC AGATACACCA CGTTATTCCA 1620 

TCCATTTTAC AAATGAGGAA ACTGAGGCAT GGAGCAGTTA AGCATCHGC CCMCATTGC 1680 

CCTCCAGTAA GTGCTGGAGC TGGAAnTGC ACCGT6CAGT CTGGCnCAT GGCCTGXCT 1740 

GTGAATCCTG TAAAAAHGT HGAAAGACA CCATGAGTGT CCAATCAAC6 TTAGCTMTA 1800 

rraCAGCCC AGTCATCAGA CCGGCAGAGG CAGCCACCCC ACTGTCCCCA GGGAGB^CAC 1860 

AAACATCCTG GCACCCTCTC CACTGCA7TC TGGAGaGCT TTCTAGGCAG GCAGTffrGAfi 1920 






CTCAGCCCCA CGTAGAGCGG GCAGCCGAGG CCHCTGAGG QATCTaCT AGCGAACAAG 1980 

GACCCTCAAT TCCAGCTTCC GCCT6ACGGC CAGCACACAG GGACAGCCCT HCATTCCGC 2040 

nCCACCTGG GGGTGCA6GC AGAGCAGCAG CGGGGGTAGC ACTGCCCGGA GQCAGAAGT 2100

CCTCCTCA6A CAGGTGCCAG TGCCTCCAGA ATGTGGCAGC TCACAAGCCT CCTGCTGTTC 2160 

GTGGCCACCT GGGGAAITTC CGGCACACCA GCTCCTCHG GTAAGGCCAC CCCACCCaA 2220 

CCCCGGGACC CTTGTG6CCT CTACAAGGCC CTGGTG6CAT CTGCCCAGGC CHCACAGCT 2280 

TCCACCATCT CTCT6AGCCC TCGGTGAGGT 6AGGGGCAGA TGGGAATGGC AG6AATCAAC 2340 

TGACAAGTCC CAGGTAGGCC AGCTGCCAGA GTGCCACACA 6GGGCTGCCA GG6CAGGCAT 2400 

GCGT6AT6GC AGGGAGCCCC GCGAT6ACCT CCTAAAGCTC CaCCTCCAC ACGGGGATGG 2460 

TCACAGAGTC CCaGGGCCT TCCCTCTCCA CCCACTCACT CCCTCAACTG TGAAGACCCC 2520 

AQ6CCCAGGC TACCGTCCAC ACTATCCAGC ACAGCCTCCC CTACTCAAAT GCACAaGGC 2580 

CTCATGGCTG CCCTGCCCCA ACCCCTTTCC TGGTCTCCAC AGCCAACQQG AGGAGGCCAT 2640 

GATTCTTGGG GAGGTCCGCA GGCACATGG6 CCCCTAAAGC CACACCAGGC TGnGGTTTC 2700 

AnTGTGCCT TTATAGAGCT GTTTATCTGC TTGGGACCTG CACCTCCACC CTTTCCCAAG 2760 

GTGCCCTCAG CTCAGGCATA CCCTCCTCTA GGATGCCTTT TCCCCCATCC CTTCTTGCTC 2820 

ACACCCCCAA OTGATCTCT CCCTCCTAAC TGTGCCCTGC ACCAAGACAG ACACTTCACA 2880 

6AGCCCAGGA CACACCTGGG GACCCTTCCT GGGT6ATAGG TaGTCTATC CTCCAGGTGT 2940 

CCCTGCCCAA GGGGAGAAGC ATGGGGAATA CTTGGTTGGG G6AGGAAAGG AA6ACT6QGG 3000 

GGATGTGTCA AGATGGGGCT GCAT6TGGTG TACTGGCAGA AGAGTGAGAG GATTTAACTT 3060 

GGCAGCCTTT ACAGCAGCAG CCAGGGCHG AGTACTOTC TCTGGGCCAG GaGTATTGG 3120 

ATGTTnACA TGACGGiaC ATCCCCATGT nTTQGATGA GTAAAHGAA CCTTAGAAAG 3180 

GTAAA6ACAC TGGCTCAAGG TCACACAGAG ATCGGGGTGG GGHCACAGG GABGCCTGTC 3240 
CATCTCAGAG CAAGGCHCG TCCTCCAACT GCCATCTGCT TCCTSSQGAG GAAW£AGCA 3300

GAGGACCCa GCGCCAAGCC ATGACCTAGA ATTAGAAT6A GTCHGA^ GGCGGiAGACA 3360 

AGACCnCCC AGGCTCTCCC AGCTCTGCTT CaCAGACCC CCTCAWC CCAGCCCCTC 3420 

TTAGGCCCCT CACCAAGGTG AGCTCCCCTC CCTCCAAAAC CAGACTCAGT GnCICCAGC 3480 

AGC6AGC6TC CCCACCAQGT 6CTGCQGATC CGCAAACGT 6CC AAC TCC HC CT6 3534 

GAG GAG aC CGT CAC AGC AGC CTG GAG CGG GAG TdC ATA GAG GAG ATC 3582 

TGT GAC TTC GAG GAG GCC AAG GAA AH TTC CAA AAT GTG GAT GAC ACA 3630 

GTAAGGCCAC CATGGGTCCA GAGGATGAQG CTCAGG6GCG AGCTGGTAAC CAGCAGGGGC 3690 

CTC6AQ6AGC AGGTGGGGAC TCAATGCTGA GGCCaCTTA GGAGTTGTQG GGGTGGCT6A 3750 

GTGGAGC6AT TAGGATGCTG GCCCTATGAT GTCGGCCAGG CACATGTGAC TGCAAGAAAC 3810 

AGAATTCAGG AAGAAGCTCC AGGAAAGAGl GTGGGGTGAC CCTAGGTGGG GACTCCCACA 3870 

GCCACAGTGT AGGTGGHCA GTCCACCCTC CAGCCACTGC TGAGCACCAC TGCQCCCCG 3930 

TCCCACaCA CAAAGAGGGG ACCTAAAGAC CACCCT6CTT CCACCCAT6C aCT6:JGAT 3990 

CAGGGTGTGT GTGT6ACC6A AACTCACTTC TGTCCACATA AAATC6CTCA aaGTGCCT 4050 

CACATCAAAG GGA6AAAATC TGAHGHCA GGGGGTCGGA AGACAGGGTC TGTGTXTAT 4110 

TTGTCTAAGG GTCAGAGTCC TTTGGAGCCC CCA6AGTCCT GTGGACGTQG CCCTA3GTAG 4170 

TAGQ6TGAGC nGGTAACGG GGCIGGCHC CTGAGACAAG GCTCAGACCC GCTCT3TCCC 4230 

TGQG6ATCGC HCAGCCACC AGGACCTGAA AATTGTGCAC GCCTQG6CCC CCTTCIWM^ 4290 

CATCCAGGGA TGCTTTCCAG TGGAGGCTTT CAQGGCAGGA GACCCTCTG6 CCTGC'tt:CCT 4350 

CTCTTGCCCT CAGCCTCCAC CTCCHGACT GGACCCCCAT CTG6ACCTCC ATCCC:ACCA 4410 

CCnCTTTCCC CA6TGGCCTC CCT6GCAGAC ACCACAGTGA CTHaGCAG GCACAFATCT 4470 

GATCACATCA AGTCCCCACC 6T6CTCCCAC CTCACCCATG GTCTCTCAGC CCCAG^GCC 4530 

TTGGCTG6CC TCTCTGATGG AGCAGGCATC AGGCACAGGC C6TGGGTCTC AACGT'SQGCT 4590 
GGGTQGTCCT GGACCAGCAG CAGCCGCCGC AGCAGCAACC CTQGTACCTG GTTAGGMCG 4650

CAGACCCTCT GCCCCCATCC TCCCAACTCT GAAAAACACT GGCHAGGGA AAGGCGCGAT 4710 

GCTCAGGQGT CCCCCAAAGC CCGCAGGCAG AGGGAGT6AT GG6ACTGGAA GGAGGCCGAG 4770 

TGACnGGTG AGGGATTCGG GTCCCHGCA TGCAGAQGCT GC7GTGGGAG CGGACAGTCG 4830 

C6AGAGCAGC ACT6CAGCTG CATGGG6AGA GGGTGTTGCT CCAGG6ACGT GQ6AT6GAGG 4890 

CTGGGCGCGG GCGGGTGGCG CTG6AGGGC6 GQ6GA6GGGC AGGGAQCACC AGCTCCTAGC 4950 

AGCCAACGAC CATCGGGCGT CGATCCCTGT HGTCIGGAA GCCCTCCCCT CCCCTGCCCG 5010 

CTCACCC6CT GCCCTGCCCC ACCCGGGCGC GCCCCTCCGC ACACCGGCTG CAGGAGCaG 5070 

ACGCTGCCCG CTCTCTCCGC AG CTG GCC TTC TGG TCC AAG CAC GTC G 5117 

GTGAGTGCGT TCTAGATCCC CGGCTGGACT ACCGGCGCCC QCGCCCCTCG GGATCiaGG 5177 

CCQCTGACa CCTACCCCGC CnGTGTCGC AG AC GGT GAC CAG TGC HG GTC 5229 

TTG GCC TT6 GAG CAC CCG TGC GCC AGC CTG TGC TGC GG6 CAC GGC AC6 5277 

TGC ATC GAC GGC ATC GGC AGC TTC AGC TGC GAC TGC C6C AGC GGC TGG 5325 

GAG GGC CGC TTC TGC CAG CGC G GTGAGGGGGA GAGGTG6AT6 CTGGCGGGC6 5377

GCQGG6CGG6 GCTGGGGCCG GGHGGGGGC GCGGCACCAG CACCAGCTGC CCGCGCCCTC 5437 

CCCTGCCCGC AG AG GTG AGC HC CTC AAT TGC TCT CTG GAC AAC GGC 5484 

GGC TGC ACG CAT TAC TGC CTA GAG GAG GTG GGC TGG CG6 CGC T6T AGC 5532 

TGT GCG CCT GGC TAC AAG CTG GGG GAC GAC ac CTG CAG TGT CAC CCC 5580 

GCA G 6TGAGAAGCC CCCAATACAT C6CCCAQGAA TCACGCTGGG T6CGGGGTGG 5634 

GCAGGCCCCT GACGGGCGCG GCGCGGGGGG CTCAGGAGGG TTTCTAGGGA GGGAGCGAGG 5694 

AACAGAGTTG AGCCTTGOGG CAGCGGCAGA C6CGCCCAAC ACCGGGGCCA CTGHAGCGC 5754 

AATCAGaCG G6AGCTGGGC GCGCCCTCC6 CTTTCCCTGC TTCCTTTCTT CCTGGCGTCC 5814 
CCGCnCCTC Ci^CGCCCC TGCGACCTGG GGCCACCTCC TGGAGCGCAA GCCCAoTGGT 5874

GQCrCCGCTC CCCA6TCTGA GCGTATCTGG GGCGAG6CGT GCAGCGTCCT CCTCC 5934 

GCCTGQCTGC GTTTTTCTCT GACGrTCTCC GGCGTGCATC GCATTTCCCT CTTTAZCCCC 5994 

TTGCTTCCn GAGGAGA6AA CAGAATCCCG AHCTGCCn CTTCTATATT nCCTmTA 6054 

TGCATTTTAA TCAAATTTAT ATAT6TATGA AACTHAAAA ATCAGAGTTT TACAAi:TCTT 6114

ACAcmcAG catgctghc cttggcatgg GTCcmrrr CArrcArnr cataa,\aggt 6174 

GGACCCTTTT AATGTGGAAA TTCCTATCn aCCCTCTAG GGCATRATC ACnATTTCT 6234 

TCTACAATCT CCCCTTTACT TCCTCTATTT TCTCTTTCTG 6ACCTCCCAT TATTC/\6ACC 6294 

TCTTTCaCT AGTTTTATTG TCTCHCTAT nCCCATCTC TTTGACTTTG TGTTTXrnr 6354 

CAQ6GAACTT TCI 1 1 III 1 1 CMIIIIIII GAGATGGAGT TTCACTCnG HGTCCCAGG 6414 

aGGAGTGCA AT6ACGT6AT CTCAGaCAC CACAACCTCC GCCTCaGGA TTCAAtlCGAT 6474 

TCTCCTGCCG CAGCCTCCCG AGTA6CT6GG AHACAGGCA TGC6CCACCA CGCCCAGaA 6534 

ATTTTGTGTT TTTAGTAGAG AAGGGGTnc TCCGTGTTGG TCAAGCTGGI CHGA/^CTCC 6594 

TGACaCAGG TGATCCACCT GCCnGGCCT CCTAAAGTGC TGGGATTACA QGCGTCiAQCC 6654 

ACC6C6CCCA GCCTCTTTCA GGGAACTTTC TACAACTTTA TAATTCAAn CTTCTCiCAGA 6714 

AAAAAATTTT TGGCCAG6CT CA6TAGCTCA GACCAATAAT TCCAGCACTT TGAGA£GCT6 6774 

AeGTGQGAGG ATIGCHGAG CTTGGGAGTT TGAGACTAGC CTGGGCAACA CAGTG/^IACC 6834 

CTGTCTCTAT TTTTAAAAAA AGTAAAAAAA CATCTAAAAA TrTAACTTrT TATTTIGAAA 6894 

TAATTAGATA TTTCCAGGAA GCTGCAAAGA AATGCCTGGT GQGCCTGTTG GCTGTGlQGn 6954 

TCCTGCAAGG CCGTGGGAAG 6CCCTGTCAT TGGCAGAACC CCAGATCGTG AGGGaTTCC 7014 

mTAGGCTG CTTTCTAAGA GGACTCCTCC AAGaCHGG AQGATGGAA6 ACGCTCACCC 7074 

ATBGTGnCG GCCCCTCAGA GCAGGGTGGG GCAGG6GAGC TGGTGCaGT GCAGGCTGTG 7134 

GACATHGCA TGACTCCaG TGGTCAGCTA AfiAGCACCAC TCCTTCCTGA AGCGGGGCCT 7194 
6AAGTCCCTA 6TCAGAGCCT CTGGnCACC TTCTGCAGGC AGGGAGAQGG 6AGTCAAGTC 7254

AGTGAGSAGG GCTTTCGCAG TTTCTCnAC AAACTCTCAA CATGCCCTCC CACCTGCACT 7314 

GCCnCCTGG AAGCCCCACA GCCTCCTATG GTTCCGTGGT CCAGTCCTTC AGCTTCTGGG 7374 

CGCCCCCATC ACGGGCT6AG ATTTTTGCTT TCCAGTCTGC CAAGTCAGTT ACTffrGTCCA 7434 

TCCATCTGCT GTCAGCHCT GGAAnGTTG CTGnGTGCC CTTTCCATTC TTTTGTTAT6 7494 

ATGCAGaCC CCTGCTGACG ACGTCCCAH GCTCTTTTAA GTCTAGATAT CTGGACTQB6 7554 

CAHCAASGC CCATTTTGAG CA6AGTCGGG CTGACCTTTC AGCCCTCAGT TCTCCATGGA 7614 

GTATGCGCTC TCHCnGGC AGQGAGGCCT CACAAACATG CCATGCCTAT TGTAGCAOa 7674 

CTCCAAGAAT GCTCACCICC nCTCCCTGT AATTCCmC aCTCTGAGG AGCTCAGCAG 7734 

CATCCCATTA TGAGACC™ CTAATCCCAG GGATCACCCC CAACAGCCCT GGGffTACAAT 7794 

GAGCmTAA GAAGTTTAAC CACCTATGTA AGGAGACACA GGCAGTGQGC GATGCTSCCT 7854 

GGCCTGACTC TTGCCATTGG GTGGTACTGT nGHGACTG ACTGAaCAC TGACTGGAGG 7914 

GGGTTTGTAA TTTGTATCTC AGGGATTACC CCCAACAGCC CTQGGGTACA ATGAGCCHC 7974 

AAGAAGTTTA ACAACCTATG TAAGGACACA CAGCCAGTGG GTGATGCTGC CTGGTCTGAC 8034 

TCTTGCCATT CA6TGGCACT GITTGTTGAC TGACTGACTG ACTGACTGGC TGACTGGAGG 8094 

QGGTTCATAG CTAATAHAA TGGAGTSGTC TAAGTATCAT TGGnCCHG AACCCTGCAC 8154 

TGTQGCAAAG TGGCCCACAG GCTGGAGGAG GACCAAGACA GGAQQ6CAGT CTCGGGAGGA 8214 

GTGCaSGCA GGCCCCTCAC CACCTCTGCC TACCTCAG TG AAG HC CCT TGT 8266 

GOG AG6 CCC TGG AAG CGG AT6 GAG AAG AAG CGC AGT CAC CTG AAA C6A 8314 

GAC ACA 6AA GAC CAA GAA 6AC CAA GTA GAT CCG CGG CTC ATT GAT G6G 8362 

AAG AT6 ACC AGG CGG GGA GAC AGC CCC TGG CAG GTGGGAGGCG AG6CAGCACC 8415 

GOaCGTCAC GTGCTGGGTC CGGGATCAQ 6AGTCCATCC TGGCACaAT GCTCAGQGTG 8475 
CAGAAACCGA GAGGGAAGCG CTGCCAHGC GTTTGGGGGA TGATGAAGGT GGGG&TGCT 8535

TCAGGGAAAG ATGGACGCAA CCTGAGGGGA GAGGAGCAGC CAGGGTGGGT GAGGGGAQGG 8595 

GCATGGGQGC ATGGAGG^T CTGCAG6AGG GAGGGHACA GTTTCTAAAA AGAGC7GGAA 8655 

AfiACACTGa CTGCTGGCGG GATTTTAGGC AGAAGCCCTG CTGATGG6AG AGGGCTAG6A 8715 

GG6AGGGCC6 GGCCTGAGTA CCCCTCCAGC CTCCACATG6 GAACT6ACAC HACIGGGn 8775 

CCCCTCTCTG CCAGGCATGG GGGAGATAGG AACCAACAAG TGGGAGTATT TGCCCTGGQG 8835 

AaCAGACTC TGCAAGGGTC AGGACCCCAA AGACCCGGCA GCCCAGTGQG ACCACAGCCA 8895 

G6ACGGCCCT TCAAGATAGG GGCTGAGGGA GGCCAAGGGG AACATCCAGG CAQCCTGGGG 8955 

GCCACAAAGT CnCCTGGAA 6ACACAAGGC CTGCCAAGCC TCTAAGGATG AGAGGAGCTC 9015 

GCTQGGCGAT GTTGGTGTGG CTGAGGGTGA CTGAAACAGT ATGAACAGTG CAGGAACAGC 9075 

ATGGGCAAAG GCA6GAAGAC ACCCTGGGAC AGGCTGACAC TGTAAAATGG GCAAAAATAG 9135 

AAAACGCCAG AAA6GCCTAA GCCTATGCCC ATAT6ACCAG GGAACCCAGG AAAGTGCATA 9195 

T6AAACCCAG GTCCCaGGA CTGGAGGCTG TCAG6AGGCA 6CCCTGTGAT GTCATCATCC 9255 

CACCCCATTC CAG 6TG GTC CTG CTG GAC TCA AAG AAG AAG CTG GCC TGC 9304 

GGG GCA GTC CTC ATC CAC CCC TCC 6TG CTG ACA GCG GCC CAC T3C 9352 

ATG GAT GAG TCC AAG AAG CTC CTT 6TC AGG CTT 6 GTATGGGCTG 9396 

GACCCABGCA 6AAGGGGGCT GCCAGAGGCC TGGGTAGGG6 6ACCAQGCAG Ga6n::AGG 9456 

TTTGGGG6AC CCCGCTCCCC AGGTGCHAA GCAAGAG6CT TCHGAGaC CACAGA^QGJ 9516 

GTTTQGGGGG AAGAGGCCTA TGT6CCCCCA CCCTGCCCAC CCAT6TACAC CCAGTArTTT 9576 

GCAGTAGGGG GnCTCTGGT GCCCTCTTCG AATCTGQGCA CAGGTACCTG CACACAZATG 9636 

TTTGTGAGGG GCTACACA6A CCTTCACCTC TCCACTCCCA CTCATGAGGA GCAGGCFGTG 9696 

TGGGCCTCAG CACCCnGGG TGCAGAGACC AGCAAGGCCT GGCaCAQGG CTGTQCCTCC 9756 

CACAGACTGA CAGGGATGGA GCTGTACAGA GGGAGCCCTA GCATCTGCCA AAGCCAilAAG 9816 
CTGCTrCCCT AGCAGGCTGG GQGCTCCTAT 6CATTQGCCC CGATaATGG CAATTTCTGG 9876

AGQGGGGGTC TGGCTCAACT CTTTATGCCA AAAAGAAGGC AAAGCATATT GAGAAASQCC 9936 

AAATTCACAT TTCCTACAGC ATAATaATG CCAGTGGCCC C6TGQGGCTT GGCTTA6AAT 9996 

TCCCAGGTGC TtTTCCCAGG GAACCATCAG TCTE6ACT6A GAQGACCHC TCiaCAGGT 10056 

GGGACCCQGC CCTGTCaCC CTGGCAfiTGC CGTGfrCTGG GGGTCCTCCT CTCTGGGTCT 10116 

CACTGCCCa GGGGTCTCTC CAGCTACCTT TGCTCCAT6T TCCTTTGTGG CTCTGGTCTG 10176 

T6TCTGGGGT TTCCAGQGGT CTCGGGCHC CCTGCTGCCC ATTCCTTCTC TGGTCTCACG 10236 

GCTCCGTGAC TCCTGAAAAC CAACCAGCAT CCTACCCCH TQGAnGACA CCTGTTQGCC 10296 

AaCCTTCTG GCAGGAAAAG TCACCGHGA TAGQGTTCCA CGGCATAGAC AGGTG6CTCC 10356

GCGCCA6T6C CTGQGACGTG TG6GTGCACA 6TCTCCQQGT GAACCnCTT CAGGCCaCT 10416 

CCCAGGCCTG CAGG6GCACA 6CAGTQGGTG GGCCTCAGGA AAGTGCCACT 6QGGA6AGGC 10476 

TCCCCGCAGC CCACTCT6AC TGTGCCCTCT GCCCTGCAG GA GAG TAT GAC CTG 10529 

CGG CGC TGG GAG AAG TGG GAG CTG GAC CTG GAC ATC AAG GAG GTC TTC 10577

GTC CAC CCC AAC TAC AGC AAG AGC ACC ACC GAC AAT GAC ATC GCA CTG 10625

CTG CAC CTG GCC CAG CCC GCC ACC CTC TCG CAG ACC ATA GTG CCC ATC 10673

TGC CTC CCG GAC AGC GGC CTT GCA GAG CGC GAG CTC AAT CAG GCC GGC 10721

CAG GAG ACC CTC GTG ACG GGC TGG GGC TAC CAC AGC AGC CGA GAG AAG 10769

GAG GCC AAG AGA AAC CGC ACC TTC GTC CTC AAC TTC ATC AAG ATT CCC 10817

GTG GTC CCG CAC AAT GAG TGC AGC GAG GTC ATG AGC AAC ATG GTG TCT 10865

GAG AAC ATG CTG TGT GCG GGC ATC CTC GGG GAC CGG CAG GAT GCC TGC 10913

GAG GGC GAC AGT GGG GGG CCC ATG GTC GCC TCC TTC CAC GGC ACC TGG 10961

TTC CTG GTG GGC CTG GTG AGC TGG GGT GAG GGC TGT GGG CTC CTT CAC 11009

AAC TAC GGC GTT TAC ACC AAA GTC AGC CGC TAC CTC GAC TGG ATC CAT 11057

GGG CAC ATC AGA GAC AAG GAA GCC CCC CAG AAG AGC TGG GCA CCT 11102

TAGCGACCCT CCCTGCAGGG CTQGGCTTTT GCATGGCAAT 66ATGG6ACA TTAAAX6AC 11162

ATGTAACAAG CACACCGGCC TGCTGHCTG TCCnCCATC CCTCTTTTGG GCTCTfCTGC 11222

AGGGAAGTAA CATTTACTGA GCACCTGTTG TATGTCACAT GCCHATGAA TAGAArCTTA 11282

ACTCCTAGAG CAAaCTGTG QGGTGGG6AG GAGCAGATCC AAGTTTTGCG GQGTCTAAAG 11342

CTCTGTGTCT TGAGGGG6AT ACTCTGTTTA TGAAAAAGAA TAAAAAACAC AACCAi:GAAG 11402

CCAQAGAGC CTmCCAGG GCTTTGGGAA GAGCCTGTGC AAGCCGGGGA TGCT&\AGGT 11462

GAGGCTTGAC CAGCTTTCCA GCTAGCCCAC CTATGAGGTA GACATGTTTA GCTCATATCA 11522

CAGAGGAG6A AACTGAGGGG TCTGAAAGGT HACATGGTG GAGCCAGGAT TCAAATTAG 11582

GTCTGAaCC AAAACCCAGG TGC! 1 1 1 1 IC TGHCTCCAC TGTCCTGGAG GACAGCTGTT 11642

TC6ACGGTGC TCAGT6TGGA GGCCACTAH AGCTCTGTAG GGAAGCAGCC AGAGACCCAG 11702

AAAGTGHGG TTCAGCCCAG AAT 11725
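The CDS feature of SEQ ID NO:1 is given as a join of seven coordinate ranges. As a rough consistency check, the sketch below parses such a location string and sums the segment lengths. The function names `parse_join` and `segment_lengths` are hypothetical helpers introduced only for this illustration, and the location string is the one read from the feature table above (the scan is damaged, so the punctuation is an assumption).

```python
import re


def parse_join(location):
    """Parse 'join(a..b, c..d, ...)' into 1-based inclusive (start, end) pairs."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]


def segment_lengths(segments):
    """Length of each inclusive segment, in nucleotides."""
    return [end - start + 1 for start, end in segments]


# Location as read from the SEQ ID NO:1 feature table (assumed punctuation).
LOCATION = (
    "join(3520..3630, 5093..5117, 5210..5347, 5450..5584, "
    "8253..8395, 9269..9386, 10516..11102)"
)

segments = parse_join(LOCATION)
lengths = segment_lengths(segments)
total = sum(lengths)

print(lengths)                # per-segment lengths
print(total, total // 3)      # total nucleotides and whole codons
```

Under these assumptions the seven segments add up to 1257 nucleotides, that is 419 whole codons, which agrees with the row-end position numbers printed beside the coding segments in the listing.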


(2) INFORMATION FOR SEQ ID NO:2:





(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 460 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Trp Gln Leu Thr Ser Leu Leu Leu Phe Val Ala Thr Trp Gly Ile
1 5 10 15

Ser Gly Thr Pro Ala Pro Leu Asp Ser Val Phe Ser Ser Ser Glu Arg
20 25 30

Ala His Gln Val Leu Arg Ile Arg Lys Arg Ala Asn Ser Phe Leu Glu
35 40 45

Glu Leu Arg His Ser Ser Leu Glu Arg Glu Cys Ile Glu Glu Ile Cys
50 55 60

Asp Phe Glu Glu Ala Lys Glu Ile Phe Gln Asn Val Asp Asp Thr Leu
65 70 75 80

Ala Phe Trp Ser Lys His Val Asp Gly Asp Gln Cys Leu Val Leu Pro
85 90 95

Leu Glu His Pro Cys Ala Ser Leu Cys Cys Gly His Gly Thr Cys Ile
100 105 110

Asp Gly Ile Gly Ser Phe Ser Cys Asp Cys Arg Ser Gly Trp Glu Gly
115 120 125

Arg Phe Cys Gln Arg Glu Val Ser Phe Leu Asn Cys Ser Leu Asp Asn
130 135 140

Gly Gly Cys Thr His Tyr Cys Leu Glu Glu Val Gly Trp Arg Arg Cys
145 150 155 160

Ser Cys Ala Pro Gly Tyr Lys Leu Gly Asp Asp Leu Leu Gln Cys His
165 170 175

Pro Ala Val Lys Phe Pro Cys Gly Arg Pro Trp Lys Arg Met Glu Lys
180 185 190

Lys Arg Ser His Leu Lys Arg Asp Thr Glu Asp Gln Glu Asp Gln Val
195 200 205

Asp Pro Arg Leu Ile Asp Gly Lys Met Thr Arg Arg Gly Asp Ser Pro
210 215 220

Trp Gln Val Val Leu Leu Asp Ser Lys Lys Lys Leu Ala Cys Gly Ala
225 230 235 240

Val Leu Ile His Pro Ser Trp Val Leu Thr Ala Ala His Cys Met Asp
245 250 255

Glu Ser Lys Lys Leu Leu Val Arg Leu Gly Glu Tyr Asp Leu Arg Arg
260 265 270

Trp Glu Lys Trp Glu Leu Asp Leu Asp Ile Lys Glu Val Phe Val His
275 280 285

Pro Asn Tyr Ser Lys Ser Thr Thr Asp Asn Asp Ile Ala Leu Leu His
290 295 300

Leu Ala Gln Pro Ala Thr Leu Ser Gln Thr Ile Val Pro Ile Cys Leu
305 310 315 320

Pro Asp Ser Gly Leu Ala Glu Arg Glu Leu Asn Gln Ala Gly Gln Glu
325 330 335

Thr Leu Val Thr Gly Trp Gly Tyr His Ser Ser Arg Glu Lys Glu Ala
340 345 350

Lys Arg Asn Arg Thr Phe Val Leu Asn Phe Ile Lys Ile Pro Val Val
355 360 365

Pro His Asn Glu Cys Ser Glu Val Met Ser Asn Met Val Ser Glu Asn
370 375 380

Met Leu Cys Ala Gly Ile Leu Gly Asp Arg Gln Asp Ala Cys Glu Gly
385 390 395 400

Asp Ser Gly Gly Pro Met Val Ala Ser Phe His Gly Thr Trp Phe Leu
405 410 415

Val Gly Leu Val Ser Trp Gly Glu Gly Cys Gly Leu Leu His Asn Tyr
420 425 430

Gly Val Tyr Thr Lys Val Ser Arg Tyr Leu Asp Trp Ile His Gly His
435 440 445

Ile Arg Asp Lys Glu Ala Pro Gln Lys Ser Trp Ala
450 455 460



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1385 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1380



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATG TGG CAG CTC ACA AGC CTC CTG CTG TTC GTG GCC ACC TGG GGA ATT 48
Met Trp Gln Leu Thr Ser Leu Leu Leu Phe Val Ala Thr Trp Gly Ile
1 5 10 15

TCC GGC ACA CCA GCT CCT CTT GAC TCA GTG TTC TCC AGC AGC GAG CGT 96
Ser Gly Thr Pro Ala Pro Leu Asp Ser Val Phe Ser Ser Ser Glu Arg
20 25 30

GCC CAC CAG GTG CTG CGG ATC CGC AAA CGT GCC AAC TCC TTC CTG GAG 144
Ala His Gln Val Leu Arg Ile Arg Lys Arg Ala Asn Ser Phe Leu Glu
35 40 45

GAG CTC CGT CAC AGC AGC CTG GAG CGG GAG TGC ATA GAG GAG ATC TGT 192
Glu Leu Arg His Ser Ser Leu Glu Arg Glu Cys Ile Glu Glu Ile Cys
50 55 60

GAC TTC GAG GAG GCC AAG GAA ATT TTC CAA AAT GTG GAT GAC ACA CTG 240
Asp Phe Glu Glu Ala Lys Glu Ile Phe Gln Asn Val Asp Asp Thr Leu
65 70 75 80

GCC TTC TGG TCC AAG CAC GTC GAC GGT GAC CAG TGC TTG GTC TTG CCC 288
Ala Phe Trp Ser Lys His Val Asp Gly Asp Gln Cys Leu Val Leu Pro
85 90 95

TTG GAG CAC CCG TGC GCC AGC CTG TGC TGC GGG CAC GGC ACG TGC ATC 336
Leu Glu His Pro Cys Ala Ser Leu Cys Cys Gly His Gly Thr Cys Ile
100 105 110

GAC GGC ATC GGC AGC TTC AGC TGC GAC TGC CGC AGC GGC TGG GAG GGC 384
Asp Gly Ile Gly Ser Phe Ser Cys Asp Cys Arg Ser Gly Trp Glu Gly
115 120 125

CGC TTC TGC CAG CGC GAG GTG AGC TTC CTC AAT TGC TCT CTG GAC AAC 432
Arg Phe Cys Gln Arg Glu Val Ser Phe Leu Asn Cys Ser Leu Asp Asn
130 135 140

GGC GGC TGC ACG CAT TAC TGC CTA GAG GAG GTG GGC TGG CGG CGC TGT 480
Gly Gly Cys Thr His Tyr Cys Leu Glu Glu Val Gly Trp Arg Arg Cys
145 150 155 160

AGC TGT GCG CCT GGC TAC AAG CTG GGG GAC GAC CTC CTG CAG TGT CAC 528
Ser Cys Ala Pro Gly Tyr Lys Leu Gly Asp Asp Leu Leu Gln Cys His
165 170 175

CCC GCA GTG AAG TTC CCT TGT GGG AGG CCC TGG AAG CGG ATG GAG AAG 576
Pro Ala Val Lys Phe Pro Cys Gly Arg Pro Trp Lys Arg Met Glu Lys
180 185 190

AAG CGC AGT CAC CTG AAA CGA GAC ACA GAA GAC CAA GAA GAC CAA GTA 624
Lys Arg Ser His Leu Lys Arg Asp Thr Glu Asp Gln Glu Asp Gln Val
195 200 205

GAT CCG CGG CTC ATT GAT GGG AAG ATG ACC AGG CGG GGA GAC AGC CCC 672
Asp Pro Arg Leu Ile Asp Gly Lys Met Thr Arg Arg Gly Asp Ser Pro
210 215 220

TGG CAG GTG GTC CTG CTG GAC TCA AAG AAG AAG CTG GCC TGC GGG GCA 720
Trp Gln Val Val Leu Leu Asp Ser Lys Lys Lys Leu Ala Cys Gly Ala
225 230 235 240

GTG CTC ATC CAC CCC TCC TGG GTG CTG ACA GCG GCC CAC TGC ATG GAT 768
Val Leu Ile His Pro Ser Trp Val Leu Thr Ala Ala His Cys Met Asp
245 250 255

GAG TCC AAG AAG CTC CTT GTC AGG CTT GGA GAG TAT GAC CTG CGG CGC 816
Glu Ser Lys Lys Leu Leu Val Arg Leu Gly Glu Tyr Asp Leu Arg Arg
260 265 270

TGG GAG AAG TGG GAG CTG GAC CTG GAC ATC AAG GAG GTC TTC GTC CAC 864
Trp Glu Lys Trp Glu Leu Asp Leu Asp Ile Lys Glu Val Phe Val His
275 280 285

CCC AAC TAC AGC AAG AGC ACC ACC GAC AAT GAC ATC GCA CTG CTG CAC 912
Pro Asn Tyr Ser Lys Ser Thr Thr Asp Asn Asp Ile Ala Leu Leu His
290 295 300

CTG GCC CAG CCC GCC ACC CTC TCG CAG ACC ATA GTG CCC ATC TGC CTC 960
Leu Ala Gln Pro Ala Thr Leu Ser Gln Thr Ile Val Pro Ile Cys Leu
305 310 315 320

CCG GAC AGC GGC CTT GCA GAG CGC GAG CTC AAT CAG GCC GGC CAG GAG 1008
Pro Asp Ser Gly Leu Ala Glu Arg Glu Leu Asn Gln Ala Gly Gln Glu
325 330 335

ACC CTC GTG ACG GGC TGG GGC TAC CAC AGC AGC CGA GAG AAG GAG GCC 1056
Thr Leu Val Thr Gly Trp Gly Tyr His Ser Ser Arg Glu Lys Glu Ala
340 345 350

AAG AGA AAC CGC ACC TTC GTC CTC AAC TTC ATC AAG ATT CCC GTG GTC 1104
Lys Arg Asn Arg Thr Phe Val Leu Asn Phe Ile Lys Ile Pro Val Val
355 360 365

CCG CAC AAT GAG TGC AGC GAG GTC ATG AGC AAC ATG GTG TCT GAG AAC 1152
Pro His Asn Glu Cys Ser Glu Val Met Ser Asn Met Val Ser Glu Asn
370 375 380

ATG CTG TGT GCG GGC ATC CTC GGG GAC CGG CAG GAT GCC TGC GAG GGC 1200
Met Leu Cys Ala Gly Ile Leu Gly Asp Arg Gln Asp Ala Cys Glu Gly
385 390 395 400

GAC AGT GGG GGG CCC ATG GTC GCC TCC TTC CAC GGC ACC TGG TTC CTG 1248
Asp Ser Gly Gly Pro Met Val Ala Ser Phe His Gly Thr Trp Phe Leu
405 410 415

GTG GGC CTG GTG AGC TGG GGT GAG GGC TGT GGG CTC CTT CAC AAC TAC 1296
Val Gly Leu Val Ser Trp Gly Glu Gly Cys Gly Leu Leu His Asn Tyr
420 425 430

GGC GTT TAC ACC AAA GTC AGC CGC TAC CTC GAC TGG ATC CAT GGG CAC 1344
Gly Val Tyr Thr Lys Val Ser Arg Tyr Leu Asp Trp Ile His Gly His
435 440 445

ATC AGA GAC AAG GAA GCC CCC CAG AAG AGC TGG GCA CCTTAG 1386
Ile Arg Asp Lys Glu Ala Pro Gln Lys Ser Trp Ala
450 455 460
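SEQ ID NO:3 pairs each row of codons with its translation. The pairing can be illustrated with a minimal sketch that builds the standard genetic code table and translates the first row of the listing. The names `CODON_TABLE` and `translate` are hypothetical, introduced only for this example. The final codon of the row is taken here as ATT (Ile), which is an assumption, since the scan is damaged at that position.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First row of SEQ ID NO:3 (positions 1 to 48); last codon assumed to be ATT.
first_row = "ATG TGG CAG CTC ACA AGC CTC CTG CTG TTC GTG GCC ACC TGG GGA ATT"
print(translate(first_row))  # one-letter form of Met Trp Gln Leu ... Gly Ile
```

With this table the row translates to MWQLTSLLLFVATWGI, matching residues 1 to 16 of the listing. The same check can be applied to each remaining row to catch codons that were misread in scanning.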



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 460 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Trp Gln Leu Thr Ser Leu Leu Leu Phe Val Ala Thr Trp Gly Ile
1 5 10 15

Ser Gly Thr Pro Ala Pro Leu Asp Ser Val Phe Ser Ser Ser Glu Arg
20 25 30

Ala His Gln Val Leu Arg Ile Arg Lys Arg Ala Asn Ser Phe Leu Glu
35 40 45

Glu Leu Arg His Ser Ser Leu Glu Arg Glu Cys Ile Glu Glu Ile Cys
50 55 60

Asp Phe Glu Glu Ala Lys Glu Ile Phe Gln Asn Val Asp Asp Thr Leu
65 70 75 80

Ala Phe Trp Ser Lys His Val Asp Gly Asp Gln Cys Leu Val Leu Pro
85 90 95

Leu Glu His Pro Cys Ala Ser Leu Cys Cys Gly His Gly Thr Cys Ile
100 105 110

Asp Gly Ile Gly Ser Phe Ser Cys Asp Cys Arg Ser Gly Trp Glu Gly
115 120 125

Arg Phe Cys Gln Arg Glu Val Ser Phe Leu Asn Cys Ser Leu Asp Asn
130 135 140

Gly Gly Cys Thr His Tyr Cys Leu Glu Glu Val Gly Trp Arg Arg Cys
145 150 155 160

Ser Cys Ala Pro Gly Tyr Lys Leu Gly Asp Asp Leu Leu Gln Cys His
165 170 175

Pro Ala Val Lys Phe Pro Cys Gly Arg Pro Trp Lys Arg Met Glu Lys
180 185 190

Lys Arg Ser His Leu Lys Arg Asp Thr Glu Asp Gln Glu Asp Gln Val
195 200 205

Asp Pro Arg Leu Ile Asp Gly Lys Met Thr Arg Arg Gly Asp Ser Pro
210 215 220

Trp Gln Val Val Leu Leu Asp Ser Lys Lys Lys Leu Ala Cys Gly Ala
225 230 235 240

Val Leu Ile His Pro Ser Trp Val Leu Thr Ala Ala His Cys Met Asp
245 250 255

Glu Ser Lys Lys Leu Leu Val Arg Leu Gly Glu Tyr Asp Leu Arg Arg
260 265 270

Trp Glu Lys Trp Glu Leu Asp Leu Asp Ile Lys Glu Val Phe Val His
275 280 285

Pro Asn Tyr Ser Lys Ser Thr Thr Asp Asn Asp Ile Ala Leu Leu His
290 295 300

Leu Ala Gln Pro Ala Thr Leu Ser Gln Thr Ile Val Pro Ile Cys Leu
305 310 315 320

Pro Asp Ser Gly Leu Ala Glu Arg Glu Leu Asn Gln Ala Gly Gln Glu
325 330 335

Thr Leu Val Thr Gly Trp Gly Tyr His Ser Ser Arg Glu Lys Glu Ala
340 345 350

Lys Arg Asn Arg Thr Phe Val Leu Asn Phe Ile Lys Ile Pro Val Val
355 360 365

Pro His Asn Glu Cys Ser Glu Val Met Ser Asn Met Val Ser Glu Asn
370 375 380

Met Leu Cys Ala Gly Ile Leu Gly Asp Arg Gln Asp Ala Cys Glu Gly
385 390 395 400

Asp Ser Gly Gly Pro Met Val Ala Ser Phe His Gly Thr Trp Phe Leu
405 410 415

Val Gly Leu Val Ser Trp Gly Glu Gly Cys Gly Leu Leu His Asn Tyr
420 425 430

Gly Val Tyr Thr Lys Val Ser Arg Tyr Leu Asp Trp Ile His Gly His
435 440 445

Ile Arg Asp Lys Glu Ala Pro Gln Lys Ser Trp Ala
450 455 460
(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10807 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

ACGCGTGTCG ACCTGCAGGT CAACG6ATCT CTGTGTaGT TTTCATBHA GTACCACACT 60

GTTrTGGTGG CTGTAGCnT CA6CTAC^r.T CTGAAGTCAT AAAGCCTTGGT ACCTCCAGCT 120

CT6TTCTCTC TCAA6ATTGT GnCTGCTGT nCGGTCTH AGT6TCTCCA CACAATTTTT 180

AGAAnGTTT GTTCTAGHC TGTGAAAAAT GATGCTGGTA nTTGATAAfi GATTGCATTG 240

AATCTGTAAA GCTACAGATA TAGTCATIGG GTAGTACAGT CACTTTAACA ATATTAACTC 300

nCACATaG TGAGCATGAT ATATTnCCC CCTCTATATC ATCTTCAATT CCTCCTATCA 360

GMICIIICA TTGCAGTnr CTGAGTAC.AG GTCHACACC TCCTTGGTTA GAGTCATTCC 420

TCAGTATTTT AHCCTHGA 1AWTJ^'(^. AATGAGGTAA TTTTCTTAGT TTaCTHCT 480

GATAGQCAT TGTTAGTGTA TATATAG^.M /^I^CAACAGAT TTCTATGTAT TAATITTGTA 540

TCCTGCAACA GATTTCTATG TATTAATTTT GTATCCTGCT ACTTTACGGA ATTC/riTAT 600

TAGCmTTG GTGACATCn GA6GATTTTC TGAA6AAAAT GGCATG6TAT GGTACiGACAA 660

GGTGTCATGT CATCT6CAAA CAGTGGCAGT TnCCncn CCCTTCCAAC aQG/,TTTCT 720

TTCATnCn TCTGTCTGAG TACGACTAGG AHCCCAATA CTATACCGAA TAAA^GTGGC 780

AAGAGTG6AC ATCCTTGTCT TArrrnCTG ACCTTAGAGG AAATGCTTTC AGTT7TTCAC 840

CATTAATTAT AATGTTTACT GTGGGCITGT CTOTGTGGC CTTCATTATA TGGACGTaA 900

TTCCCTCTAT ACCCACCTTG TTGAGAGrTT TTATCATAAA AGTATGTTGA ATTnGTCAA 960

AAGrnrrcc tgcatctatt GACATGAirr hactcttca attcattaat GATrfmn 1020

CTTCATnTG TTAATGATTT CCATTCTTCA ATrTGTTAAC GTGGTATATC ACAHGAnG 1080

ATTTGTGGAT ACCTTTGTAT CCCTGGtiATA AACCTCACTT GATCATGAfiC rTTCAATGTA 1140 

TmrrSAAn CACTTTGCTA ATAHCTGII GGGTATrnT GCATCTCTAT TCATCAATGA 1200 

TATTGGCCTA AGAAAGGTn" TGICTU-TTT TAGTATCAGG 6TGATGCTGG CaCATAGAG 1260 

AGAGTTTAGA AGCATTTCCT CCTCTTTGAT TTTTCQGAAT AGTITGAGTA QGATABGTAT 1320 

TAACTCnCT nAAATGlTT GGGGACUCC CTGGTGAGCC GGTGGnGAG AATCCGCaC 1380 

AGGGATGTQG GTTT6ATCCC TGGTCAGGGA ACCATTAATA AGATCCCACA TGCT6CAGGC 1440 

AACAAGCCCC CAAGCTGCAA GCACTGAGCT GCAACC6CTG CAGTGCCCAC AGGCCACGAC 1500 

CAGAGAAAGC CCACATACAG CAGGG>WiAC CCAGCACAAC CG6AAAAAGG AGmTGGTGG 1560 

AATACAGCTG TGAAGCCGTC TGGTCCTGGA CTCCTQCTTG AQGGAATTTT HAAAAAHA 1620 

TTGATTCAAT TTCATTACTG GTAACTGGTC TGHCATAn nCTATTTa TCCGGGHCA 1680 

GTCTTCGGAG AHGTACATG CCTAGGAATG TGICCGTHC nCTAGGTTG TCCATTTTAT 1740 

TGGACATGCA TGG6A6CACA CAGCACCGAC CAGCGAGACT CATGCTQQCT TCCTQfiGGCC 1800 

AQGaOQGGC CCCAAGCAGC ATGl^CAfCCT AGAGTGTGTG AAAGCCCACT GACCCTQCCC 1860 

AGCCCCACAA TnCATTCTG AGAAGTGATT CCTTGCnCT GCACTTACAG GCCCAGGATC 1920 

TGACCreCTT CT6AGGAGCA GGGGmiGG CAG6ACGGGG AGATGCTGAG AGCCGACGQfi 1980 

GGTCCAQGTC CCCTCCCAGG CCCCCCTGTC TG6GGCAGCC CHOGGAAAG AHGCCCCAG 2040 

TCTCCCTCCT ACAGT6GTCA GTCCCAGCTG CCCCAGGCCA GAGCTQCTTT ATnCCGTCT 2100 

CTCTCTCTGG ATGGTATTCT CTGGAAGCTG AAGGHCCTG AAGHATGAA TAGCITTGCC 2160 

CTGAAQGSCA TQGTTTGTGG TCACGGHCA CAGGAACHG GGA6ACCCTG CAGCTCAGAC 2220 

GTCCC6AGAT TGGTGGCACC CAGATTTCCT AAGCTCGCTG G6GAACAGQG CGCTT6TTTC 2280 

TCCaOGCTG ACCTCCCTCC tCCCTGCATC ACCCA61TCT 6AAAGCAGAG CQGTGCTGGG 2340 
GTCACAGCCT CTCGCATCTA ACGCCGGTGT CCAAACCACC CGTGCTQGTG 7TCGGGGQ6C 2400

TACaATCGG 6AAGGGCTTC TCACTGCA6T GGTGCCCCCC GTCCCCTCTG AGATCAGAAG 2460 

TCCCAGTCCG GACGTCAAAC A6GCCGAGCT CCCTCCA6AG GCTCCAGQGA GG6ATCCTTG 2520 

CCCCCCCGCT GCTGCCTCCA GCTCCTGG7G CCGCACCCTT GAGCaGATC HGTAGACGC 2580 

aCAGTCTAG TCTCT6CCTC CGTGnCACA CGCCTTCTCC CCATGTCCCC TCC6T3TCCC 2640 

CGTTTTaCT CACAAGGACA CCGGACAHA GAHAGCCCC TGITCCAGCC TCACCTGAAC 2700 

ASCTCACATC T6TAAA6ACC TA6ATTCCAA ACAAGATTCC AACCTGAAGT TCCCG3TQGA 2760 

TGIGAGHCT GGG6CGACAT CCTTCAACCC CATCACAGCT TGCAGTOAT CGCAAiWCAT 2820 

GGAACCTG6G GHTATCGTA AAACCCAGGT TCHCATGAA ACACTGAGCT TCGAGBCnG 2880 

TTGCAAGAAT TAAAG6TGCT AATACAGATC AGGGCAAGGA CTGAAGCT6G CTAAG':CTCC 2940 

TCTTTCCATC ACAGGAAAG6 GGGGCCTGGG GGCGGCTGGA GGTCTQCTCC CGTGAI3TGAG 3000 

CTCTTTCaG CTACAGTCAC CAACAGTCTC TCT6GGAA6G AAACCAGAGG CCAGA)5AGCA 3060 

AGCCGGAGCT AGITTAGGAG ACCCCTGAAC CTCCACCCAA GATGaCACC AGCCAliCGGG 3120 

CCCCCTG6AA AGACCCTACA G7TCAGGGGG GAAGAGGGGC TGACCCGCCA GGTCCCTGCT 3180 

ATCAGGAGAC ATCCCCGCTA TCAGGAGATT CCCCCACCTT GCTCCCGHC CCCTATCCCA 3240 

ATACGCCCAC CCCACCCCTG TGATGAGCAG THAGTCACT TAGAATBTCA ACTGAAGGCT 3300 

TTT6CATCCC CTTTGCCAGA GGCACAAG6C ACCCACAGCC TGCTGGGTAC CGACGtXCAT 3360 

GTGGAnCAG CCAGGAGGCC TGTCCTGCAC CCTCCCTGCT CGGGCCCCCT CTGTGCTCAG 3420 

CAACACACCC AGCACCAGCA TTCCCGCTGC TCCTGAQGTC TGCAGGCAGC TCGCTiiTAGC 3480 

CTGAGCGGTG TG6AGGGAAG TGTCCTGGGA GATTTAAAAT GTGAGAGGCG QGAQG"GGGA 3540 

GGTTGQGCCC TGTGGGCCT6 CCCATCCCAC GTGCCTGCAT TAGCCCCAGT GCTGC"CAGC 3600 

CGTGCCCCCG CCGCA6GGGT CAGGTCACH TCCCGTCCTG GGGTTATTAT GACTCTGTC 3660 

ATTGCCAnG CCATnTTGC TACCCTAACT 6GGCA6CAGG TGCTTGCAGA GCCCTCGATA 3720 
CCGACCAGGT CCTCCCTCG6 AGCTCGACCT GAACCCCATG TCACCCTTGC CCCAGCCTGC 3780

AGAGGGTGGG TGACTGCA6A GATCCCHCA CCCAAGGCCA CQGTCACATG GnTGGAGGA 3840

GCTGGTGCCC AAGGCA6AGG CCACCCTCCA GGACACACCT GTCCCCAGTG CTQGCTaGA 3900

CCTGTCCTT6 TCTAAGAGGC TGACCCCGGA AGTGTTCCTG GCACTG6CAG CCAGCCTGGA 3960

CCCAGAGTCC AGACACCCAC CTGT6CCCCC GCTTCTGGGG TCTACCA6GA ACCGTCTAQG 4020

CCCAGAQGGG ACnCCTGCT TGGCCTTGGA TGGAAGAAGG CCTCCTAnG TCCTCGTAGA 4080

GGAAGCCACC CCGGG6CCTG AGGATGAGCC AAGTGGGATT CCGG6AACCG C6TGGCTGGG 4140

GGCCCAGCCC 6GGCTGGCTG GCCTGCATGC CTCCTGTATA AGGCCCCAA6 CCTSCTGTCT 4200

CAGCCaCCA CTCCCTGCAG AGCTCAGAAG CAC6ACCCCA GGGATATCCC TGCAGCCAT6 4260

AAGTGCaCC TGCTT6CCCT 6GGCCTGGCC CTCGCCTGTG GCGTCCAGGC CATCATCGTC 4320

ACCCAGACCA T6AAAGGCCT GGACATCCAG AAGGHCGAG GGTTGGCCGG GTGGGTGAGT 4380

TGCAGGGCGG GCAGGGGAGC TGGGCCTCAG AGAGCCAAGA GAGGCT6TGA CGTTGGGTTC 4440

CCATCAGTCA GCTAGGGCCA CCTGACAAAT CCCCGCTGG6 GCAGCTTCAA CCA6GCGTTC 4500

ACTGTCTTGC ATTCTGGAGG CTGGAAGCCC AAGATCCAGG TGnCGCAGG GCTGGCTTCT 4560

CCTGCGGCC6 CTCTCTGGG6 AGCAGACGGC CGTCnCTCC AGTCCTCTGC GCGCCCTGAT 4620

TTCCTCTTCC TGT6AGGCCA CCAQGCCTGC TGGAAACACG CCTGCCTGCG CAGCnCACA 4680

CGACCTTTGT CATCTCTTTA AAGGCCATGT CTCCAGAGTC ATGTGHGAA GnCTGGGGG 4740

TTAGTQSGAC ACAGHCAGC CCCTAAAAGA GTCTCTCTGC CCCTCAAAH TTCCCCACCT 4800

CCAGCCATGT CTCCCCAAGA TCCAAATGH GCTACATGTG GGQGGGCTCA TCTGGGTCCC 4860

TCnTGGGn CAGTGT6A6T CTQQGGAGAG CAHCCCCAG G6TGCA6AGT TGGGGGGAGT 4920

ATCTCAGGGC TGCCCAG6CC GGGGTGGGAC AGAGA6CCCA aGTGGQGCT GGGGGCCCCT 4980

TCCCACCCCC AGAGTGCAAC TCAAGGTCCC TCTCCAGGTG 6CQGGGACTT GGCACTCCn 5040

QGCTATGGCG GCCAGCGACA TaCCCTGCT G6ATGCCCAG AGTCCCCCCC TGAG/CTGTA 5100

CGTQGAQGAG CT6AAGCCCA CCCCCGAGGG CAACCTQGAG ATCCTGaGC AGAAATGGTG 5160 

QGCGTCTCTC CCCAACATGG AACCCCCACT CCCCAGQGCT GTGGACCCCC CGGGGGGTOG 5220 

GGT6CAQGAG GGACCAGQGC CCCAGGGCTG GGfiAAGAGGG aCAGAGm ACTQGTACCC 5280

6GCGCTCCAC CCAAGGCTGC CCACCCAG6G CTTTTTTTTT TTTTAAACTT TTAnAATn 5340 

GATGCTTCA6 AACATCATCA AACAAATGAA CATAAAACAT TCATnTTGT TTAC7TG6AA 5400

GGGGAGATAA AATCCTCT6A AGTOAAATG CATAGCAAAG ATACATACAA TGAGGCAGGT 5460

ATTaCAAn CCCTGTTAGT aGAGGAHA CAAGTGTAn TGAGCAACAG AGAGACATTr 5520 

TCATCATTTC TAGTCTGAAC ACCTCA6TAT CTAAAAT6AA CAAGAAGTCC TGGAAAC6AA 5580 

GCAfiTGTOQG GATAGGCCCG TGTGAAGGCT GCTGGGAGGC AGCA6ACCTG GGTCTTCGGG 5640 

CTCAAQCAGT TCCCGCTACC AGCCCTGTCC ACCTCAGACG GGGGTCAQGG TGCA63A6AG 5700 

AQCTB6ATGG 6T6T66QG6C AGAGATGGGG ACCTGAACCC CAGGQCTGCC nTTG3GGGT 5760 

GCCTGTGGTC AAGGCTCTCC CTGACCTTTT CTCTaGGCT TCATCT6ACT TQCCrGGCC 5820 

CATCCACCC6 GTCCCCTGTG GCCT6AGGTG ACAGTGAGT6 CGCCGAGOa AGHGiSCCAG 5880 

CTGQCTCCTA TGCCCATGCC ACCCCCCTCC AGCCCTCCTG GGCCAGCTTC TGCCC'^GGC 5940 

CaCAGHCA TCCT6ATGAA AATGGTCCAT GCCAAT6GCT CAGAAAGCA6 CTGTCmCA 6000 

GGGAGAACGG CGAGTGTGa CAGAAGAAGA HAHGCAGA AAAAACCAAG ATCCCTGCGG 6060 

TGTTCAAGAT CGATGGTGAG TCCGGGTCCC TGGGG6ACAC CCACCACCCC CGCCOXGGG 6120 

6ACTGTGGAC AGGUCAGGG GGCTGGC6TC GGGCCCTGG6 AT6CTAA(^ ACTG6TGGTG 6180 

ATGAAGACAC TGCCTTGACA CCTGCTICAC TTGCCTCCCC TGCCACCTGC CCGQGI^CTT 6240 

GGQGCGGTCG CCAT6QGCAG GTCCC6GCTG 6CGG6CTAAC CCACCAGQGT GACACiXGAG 6300 

CTCTCTTTGC TGQGGGGCGG GCGGTGCTCT GGGCCCTCA6 GCTGAGCTCA QGAGG TACa 6360 

GTGCCCTCCC AGGGGTAACC 6AGAGCCGTT GCCCACTCCA GGGGCCCAfiG TGCCCCACGA 6420 
CCCCAfiCCCG CTCCACAGCT CCTTCATCTC CTGGAGACAA ACTCTGTCCG CCCTCGCTCA 6480

TTCACTren CGTCCTAAAT CCGAGATGAT AAAGCTTCGA GGGGGGGT7G GGGTTCCATC 6540

AGGGCTGCCC nCCGCCGGG CAGCCTGGGC CACATCTGCC CTTGGCCCCC TCAGGACTCA 6600

CTCTGACTGG AGGCCCTGCA CTGACTGACG CCAGGGTGCC CAGCCCAGGG TCTCTGGCGC 6660

CATCCAGCTG CACTGGGTTT GGGTGCTGGT CCTGCCCCCA AGCTGCCCGG ACACCACAGG 6720

CAGCCGGGGC TGCCCACTGG CCTCGGTCAG GG7GAGCCCC AGCTGCCCCC GCTCAGGGCT 6780

TGCCCCGACA ATGACCCCAT CCTCAGGACG CACCCCCCTT CCC7TGCTGG GCAGTGTCCA 6840

GCCCCACCCG AGATCGGGGG AAGCCCTATT TCHGACAAC TCCAGTCCCT GGGGGAGGGG 6900

GCClCAGAa GAGTGGTGAG TGHCCCAAG TCCAGGAGGT GGTGGAGGGT CCTGGCSGAT 6960

CCAGAGTTGA CAGTG.AGGGC nCCTGGGCC CCATGCGCCT GGCAGTGGCA GCAGGGAAGA 7020

GGAAGCACCA TTTCAGGGGT GGGGGATGCC AGAGGCGaC CCCACCCCGT CTTCGCCGGG 7080

TGGTGACCCC GGGGGAGCCC CGCTGGTCGT GGAGGGTGCT GGGGGCTGAC TAGCAACCCC 7140

TCCCCCCCCG HGGAACTCA CTTTTCTCCC GTC7TGACCG CGTCCAGCCT TGAATGAGAA 7200

CAAAGTCCn GTGCTGGACA CCGACTACAA AAAGTACCTG CTCnCTGCA TGGAAAACAG 7260

TGCTGAGCCC GAGCAAAGCC TGGCCTGCCA GTGCCTGGGT GGGTGCCAAC CCTGGCTGCC 7320

CAGGGAGACC AGCTGCGTGG TCCHGCTGC AACAGGGGGT GGGGGGTGGG AGCTTGATCC 7380

CCAGGAGGAG GAGGGGTGGG GGGTCCaCA GTCCCGCCAG GAGAGAGTGG TCGCATACCG 7440

GGAGCCAGTC TXTGTGGGC CTGTGGGTGG CTGGGGACGG GGGCCAGACA CACAGGCCGG 7500

GAGACGGGTG GGCTGG'^GAA CTGTGACTGG TGTGACCGTC GCGATGGGGC CGGTGGTCAC 7560

TGAATCTAAC AGCCTTTGn ACCGGGGAGT nCAAHAH TCCCAAAATA AGAACTCAGG 7620

TACAAAGCCA TCTTTCAACT ATCACATCCT GAAAACAAAT GGCAGGTGAC ATTTTCTGTG 7680

CCGTAGCAGT CCCACTGGGC ATTTTCAGGG CCCCTGTGCC AGGGGGGCGC GGGCATCGGC 7740
GAGTGGAGGC TCCTGGCTGT GTCAGCCGGC CCAGGGGGAG GAAGGGACCC GGAC/ifiCCAG 7800

AGGTGGGGGG CAGGCTTTCC CCCTGTGACC TGCAGACCCA CTGCACTGCC aCGGAGGAA 7860 

GGGAGGGGAA CTAGGCCAA5 GGGGAAGG6C AGGTGCTaG GAGGGCAAGG GCA6ACCTGC 7920 

AGACCACCCT G6GGAGCAGG 6ACTGACCCC CGTCCCTGCC CCATA6TCAG GACCCCGGA6 79B0 

GTGGACAACG AGGCCCTGGA GAAATTC6AC AAAGCCCTCA AGGCCCTGCC CATGCACATC 8040 

CGSCrrOCCT TCAACCCGAC CCAGCTGGAG GGTGAGCACC CAGGCCCC6C CCTTCCCCAG 8100 

GGCAGGAfiCC ACCCGGCCCC GGGACGACCT CCTCCCATGG TGACCCCCAG CTCCC::AGGC 8160 

CTCCCAGGAG 6AAGGG6TGG GGTGCAGCAC CCCGT66GGG CCCCCTCCCC ACCCCrPGCC 8220 

AGGCC7CTCT TCCCGAGGTG TCCAGTCCCA TCCTGACCCQ CCCATGAaC TCCCTCCCCC 8280 

ACAGQGCAGT GCCACGTCTA GGTGAGCCCC TGCCGGTGCC TCTG666TAA 6CTGCCTGCC 8340 

CTGCCCCACG TCCT1GG6CAC ACACATGGGG TAGGGGGTCT TGGTGGGGCC TGGGAiXCCA 8400 

CATCAGGCCC TGG66TCCCC CCTGTGA6AA TGGCTGGAAG CTG6GGTCCC TCCTG(jCGAC 8460 

TGCAGAGCTG GaCGCCGCG TGCCACTCTT GTGGGTGACC TGTGTCCTGG CCTCAr.ACAC 8520 

TGACCTCCTC CAGCTCCnC CAGCAGA6CT AAG6CTAAGT 6AGCCAGAAT GGTACITAAG 8580 

GCGAGGaAG CGGTCCHCT CCCGAGGAGG GGCTGTCCTG 6AACCACCAG CCATGGAGAG 8640- 

GCTGGCAAG6 6TCTGGCAGG T6CCCCAGGA ATCACAGQGG 6GCCCCATGT CCATT'CAGG 8700 

6CCC6QGAGC CnGGACTCC TCTG6GGACA GACGACGTCA CCACCGCCCC CCCCCCJ^TCA 8760 

GGGGGACTAG AAGGGACCAG GACTGCA6TC ACCCTTCCTG GGACCCAGGC CCCTCC:AGGC 8820 

CCCTCCTGQG GCTCCTGCTC TGGGCAGCTT CTCCHCACC AATAAAGGCA TAAACCTGTG 8880 

aCTCCCTTC TGAGTCTTTG CTGGAC6ACG GGCAGGG6GT GGAGAAGTGG TGGGG^fiGGA B940 

GTaGGaCA GAGGATGACA GCGGGGCTGG GATCCAGGGC GTCTGCATCA CAGTCITGTG 9000 

ACAACTGGGG GCCCACACAC ATCACTGCGG CTCTTT6AAA CTTTCAGGAA CCAGGClAffiG 9060 

ACTCGGCA6A GACATCTGCC AGTTCACTTG 6AGTGTTCAG ICAACACCCA AACTCCACAA 9120 
AGGACAGAAA GTGGAAAATG GCTGTCTCTT AGTCTAAIAA ATATTGATAT GAAACTCAAG 9180

TTCCTCATGG ATCAATATGC CTTTATGATC CAGCCAGCCA CTACTGTCGT ATCAACTCAT 9240

GTACCCAAAC GCACTGATCT GTCTGGCTAA TGATGAGAGA TTCCCAGTAG AGAGCTGGCA 9300

AGAGGTCACA GTGAGAACTG TCTGCACACA CAGCAGAGTC CACCAGTCAT CCTAAGGAGA 9360

TCAGTCaCG TGnCAnGG AGGACTGATG TTGAAGCTGA AACTCCAATG crrraoccAC 9420

CTGATSTGAA GAGCTGACTC ATTTGAAAAG ACCCTGATGC TGGGAAAGAT TGAGGGCAGG 9480

AGGAGAAGGG GACGACAGAG GATGAGATGG nGGATGGCA TCACCAACAC AATGGACATG 9540

GGTTTGGGTG GACTCCAGGA GTTGGTGATG GACAGGGAGG CCTGGCGTGC TACGGAAGCG 9600

GTTTATGGGG TCACAAAGAC TGAGTGACTG AACTGAGCTG AACTGAATGG AAATGAGGTA 9660

TACAGCAAAG TGGGGATnT HAGATAATA AGAATATACA CATAACATAG TGTATACTCA 9720

TATTTTTATG CATACCTGAA TGCTCAGTCA CTCAGTCGTA TCTGACTCTG TSACCTATGG 9780

ACCGTAGCCT TCCAGGTTTC TTCTGTCCAC AGAATTCTCC AAGGCAAGAA TACTGGAGTG 9840

GGTTAGCCAn TCCTCCTCCA GGGGATCaC CCGACCCAGG GATTGAACCG GCATCTCCTG 9900

TATTGGCAGG TGGATTCTTT ACCACTGTGC CACCAGGGAA GCCCGTGTTA CTaCTATGT 9960

CCCACTTAAT TACCAAAGCT GCTCCAAGAA AAAGCCCCTG TGCCCTCTGA GCnCCCGGC 10020

CTGCAGAGGG TGGTGGGGGT AGACTGTGAC CTGGGAACAC CaCCCGCTT CAGGACTCCC 10080

GGGCCACGTG ACCCACAGTC CTGCAGACAG CCGGGTAGCT CTGCTCnCA aggctca™ 10140

TCTTTAAAAA AAACTGAGGT CTATTTTGTG AcnccaGc CGTAACnCT GAACATCCAG 10200

TGCGATGGAC AGGACCTCCT CCCCAGGCCT CAGGGGCTTC AGGGAGCCAG CCTTCACCTA 10260

TGAGTCACCA GACACTCGGG GGTGGCCCCG CCnCAGGGT GaCACAffTC TTCCCATCGT 10320

CCTGATCAAA GAGCAAG.ACC AATGACTTCT TyVGGAGCAAG CAGACACCCA CAGGACAaG 10380

AGGTTCACCA GAGCTGAOCT GTCCTTTTGA ACCTAAAGAC ACACAfiCTCT CGAAGGTTrr 10440
aCTTTAATC TGGATTTAAG GCCTACTTGC CCCTCAAGAfi GGAAGACAGT CaGCATGTC 10500

CCCAGGACAG CCACTCGGTG GCATCC(3AGG CCAOTAGTA TTATCTGACC GCACCCTGGA 10560 

ATTAATCQGT CCAAACTGGA CAAAAACCTT 6GTGG6AAGT TTCATCCCAG AGGCCTCAAC 10620 

CATCCT6CTT TGACCACCCT GCATCTTTTT nCTTTTATG TGTATGCATG TATA"ATATA 10680 

TATATATTTT IllilllllC ATmTTGGC TGTCaGGCT GnCGTTGCA GTTCGGTGCG 10740 

CAQGCnCTC TCTACmrCT CTCTAGTCTT CTCHATCAC AGftGCAGTCT CTAG/^CGATC 10800 

GACGCGT 10807 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
AATTCCGATC GACGCGTCGA CGATATACTC TAGACGATCG ACGCGTA 47
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 47 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
AAGCTACGCG TCGATCGTCT AGAGTATATC GTCGACGCGT CGATCGG 47
(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
TGGATCCCCT GCCGGTGCCT CTGG 24 
(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
AACGCGTCAT CCTCTGTGAG CCAG 24
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6839

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:



ACTACGTAGT 10 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZCg62

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
AGTCACCTGA GAAGAAAACG /WiACA 25
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6303

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
ATTTGCGGCC GCCTGCAGCC ATGTGGCAGC TCACAAGCCT CCTGC 45 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6337

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
CAGGAAGGAG nCGCGCGCT TGCGCCGTTG CAGCACCTI5G TGGGC 45 
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6306 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

CTTCnCCTG AAnCTGnr CTTGC 25 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC6338 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
CGGATCCGCA AGCGCGCCAA CTCCTTCC 28

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC6373

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
AAAGTAAAAA AAGATCTAAA AATTTAAC 28
(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE:
(B) CLONE: ZC6305

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GTGTCTCGTT TTCTTCTTAA GTGACTGCGC TT 32
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC6302

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
TTAAGAAGAA AACGAGACAC AGAAGACCAA GAAGACCAAG TAGATCCGC 49
(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vii) IMMEDIATE SOURCE: 
(B) CLONE: ZC6304 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
GGATCTACTT GGTCTTCTTG GTCTTCTGTG TCTCGTTTTC TTC 43
(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
Arg Arg Lys Arg 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
Lys Arg Lys Arg 
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

Ser His Leu Arg Arg Lys Arg Asp 
1 5 

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6763 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

ACGCGTCGAC CTGCAC-GTCA ACGGATCTCT GTGTCTGTR TCATGTTAGT ACCAD\CTGT 60 

TTT^TGGCT GTAGCTTTCA GCTACAGTCT GAAGTCATAA AGCCTGGTAC CTCCAGCTCT 120 

GHCTCTCTC AAGAHGTGT TCTGCTGHT GGGTCTTTAG TGTCTCCACA CAATTTTAG 180 

AAnGTTTGT TCTAGTTCTG TGAAAAATGA TGCTGGTAH TT6ATAAGGA TTGCA-7GAA 240 

TCTGTAAAGC TACAGATATA GTCAHGGGT A6TACAGTCA CTTTAACAAT ATTAACTCTT 300 

CACATCTGTG AGCATGATAT ATTTTCCCCC TCTATATCAT CHCAATTCC TCaA"CAGT 360 

nCTTTCATT GCAGTTITCT GAGTACAGGT CTTACACCTC CHGGTTAGA GTCAr CCTC 420 

AGTATTTTAT TCCriTGATA CAAHGTGAA TGAGGTAAH TTCTTAGTTT CTCTT'CTGA 480 

TAQCTCAHG TTAGTGTATA TATAGAAAAG CAACAGATTT CTATGTATTA ATTTTCJTATC 540 
CTGCAACAGA TTTCTATGTA TTAATTnGT ATCCTGCTAC nTACGGAAT TCACnAHA 600

GCmTTGGT GACATCHGA GGAfnTCTG AAGAAAATGS CATGGTATGG TAGGACAAGG 660 

TGTCAT6TCA TCTGCAAACA GIGGCAGHT TCCTTCHCC CHCCAACCT GGATTTCnT 720 

GATTTCTTTC TGTCTGAGTA CGACTAGGAT TCCCAATACT ATACCGAATA AAAGTGGCAA 780 

6AGTQGACAT CCnCTCTTA TTTTtCTGAC CTTAGAGGAA ATGCTTTCA6 TmTCACCA 840 

HAAHATAA TGTTTACTGT GGGCHGTCA TAT6TGGCCT TCATTATATG GAGGTCTATT 900 

CCaCTATAC CCACCnGTT GAGAGTITTT ATCATAAAAG TATGHGAAT TTTGTCAAAA 960 

GmnCCTG CATCTATIGA GATGATnTT ACTCHCAAT TCATTAATGA TnTTATTCT 1020 

TCATTTTGn AATGATTTCC AHCnCAAT TTGnAACGT GGTATATCAC ATTGAHGAT 1080 

TTGTBGATAC CnTGTATCC CTGG6ATAAA CCTCACHGA TCATGAGCTT TCAATGTATT 1140 

irraAATTCA CTTTGCTAAT ATTCTGTTGG GTATmTGC ATCTCTATTC ATCAATGATA 1200 

TT6GCCTAAG AAAGGTTrTG TaOGnTTA GTATCAGGGT GATGCTGGCC TCATAGAGAG 1260 

AGTTTAGAAG CATTTCCTCC TCTTTGAnT TTCGGAATAG TfTGAGTAGG ATAGGTATTA 1320 

ACTCTTCnr AAATGTTTGG GGACHCCCT GGT6AGCCGG TG6TTGAGAA TCCGCCTCAG 1380 

GGATGTGGGT TTGATCCCTG GTCAGGGAAC CAHAATAAG ATCCCACATG CTGCAGGCAA 1440 

CAAGCCCCCA ACCTGCAACC ACTGAGCT6C AACCGCTGCA GTGCCCACAG GCCACGACCA 1500 

GA6AAAGCCC ACATACAGCA GGGAAGACCC AGCACAACCG 6AAAAAGGAG TrTGGTGGAA 1560 

TACAGCTGTG AAGCCGTCTG GTCCTGGACT CCTGCTTGAG GGAATTTTTT AAAAATTATf 1620 

GATTCAATn CAHACTCGT AACTGGTCTG TTCATATTTT CTATTrcnC CGGGTTCAGT 1680 

CTTGGGAGAT TGTACATGCC TAG6AATGTG TCCGlTTCn CTAGGTTGTC CATTTTATTG 1740 

GACATGCATG GGAGCACACA GCACCGACCA GCGAGACfCA TGCTGGCTTC CTGQGGCCAG 1800 

GCTS3GGCCC CAAGCAGCAT GGCATCCTAG AGTGTGTGAA AGCCCACT6A CCCTGCCCAG 1860 

CCCCACAAH TCAHCTGAG AAGTGATTCC TTGCnCTGC ACTTACAGGC CCAGGATCTG 1920 
ACCTGCna GAGGAGCAGG GGTTTTGGCA GGACGGGGAG ATGCTGAGAG ccga!:ggggg 1980

TCCAGGTCCC CTCCCAGGCC CCCCTGTCTG GGGCAGCCCT TGGGAAAGAT TGCCCCAGTC 2040

TCCCTCCTAC AGTGGTCAGT CCCAGCTGCC CCAGGCCAGA GCTGCTHAT TTCaJTCTa 2100

CTCTCTGGAT GGTATTCTCT GGAAGCTGAA GGHCCTGAA GHATGAATA GCTTGCCCT 2160

GAAGGGCATG GHTGIGGTC ACGGHCACA GGAACTTGGG AGACCCTGCA GCTCAGACGT 2220

CCCGAGAHG GTGGCACCCA GATTTCaAA GCTCGCTGGG GAACAGGGCG CTTG"7TCTC 2280

CCTGGaGAC CTCCCTCCTC CCTGCATCAC CCAGTTCTGA AAGCAGAGCG GTGCTGGGGT 2340

CACAGCCTCT CGCATCTAAC GCCGGTGTCC AAACCACCCG TGCTGGTGTT CGGGdGGCTA 2400

CaATGGGGA AGGGCTTCTC ACTGCAGTGG TGCCCCCCGT CCCCTCTGAG ATCACIAAGTC 2460

CCAGTCCGGA CGTCAAACAG GCCGAGCTCC CTCCAGAGGC TCCAGGGAGG GATCCTTGCC 2520

CCCCCGC71GC TGCCTCCAGC TCCTGGTGCC GCACCCnOA GCCTGATCTT GTAG^£GCCT 2580

CAGTCTAGTC TCTGCCTCCG TGTTCACACG CCTTCTCCCC ATGTCCCCTC CGTGICCCCG 2640

T7TTCTCTCA CAAGGACACC GGACAHAGA TTAGCCCCTG nCCAfiCCTC ACCTCAACAG 2700

aCACATCTG TAA/\GACCTA GAHCCAAAC AAGATTCCAA CCTGAAGTTC CCGGTGGATG 2760

TGAGnCTGG GGCGACATCC TTCAACCCCA TCACAGCTTG CAGTTCATCG CAAWiCATGG 2820

AACCTGGGGT TTATCGTAAA ACCCAGGTTC TTCATGAAAC ACTGAGCTTC GAGGCTTGH 2880

GCAAGAATTA AAGGTGCTAA TACAGATCAG GGCAAGGACT GAAGCraCT AAGCCTCCTC 2940

TTTCCATCAC AGGAAAGGGG GGCCTGGGGG CGGCTGGAGG TCTGCTCCCG T5AGTGAGCT 3000

CTTTCCTGCT ACAGTCACCA ACAGTCTCTC TGGGAAGGAA ACCAGAGGCC AGAGAGCAAG 3060

CCGGAGCTAG TTTAGGAGAC CCCTGAACCT CCACCCAAGA TGCTGACCAG CCAGC3GGCC 3120

CCCTGGAAAG ACCCTACAGT TCAGGGGGGA AGAGGGGCTG ACCCGCCAGG TCCCT3CTA7 3180

CAGGAGACAT CCCCGCTATC AGGAGATTCC CCCACCHGC TCCCGTTCCC CTATCCCAAT 3240
ACGCCCACCC CACCCCTGTG ATGAGCAGTT TAGTCACTTA GAATGTCAAC TSAAGGCTTT 3300

TGCATCCCCT HGCCAGAGG CCACAGCCTG CTGGGTACCG ACGCCCATGT 3360

GGATTCAGCC AGGAGGCCTG TCCTGCACCC TCCCTGCTCG GGCCCCCTCT GTGCTCAGCA 3420

ACACACCCAG CACCAGCATT CCCGCTGCTC CTGAGGTCTG CAtmGCTC GCTGTAGCCT 3480

GAGCGGTGTG GAGGGAAGTG TCCTGGGAGA TTTAAAATGT GAGAGGCGGG AGGTGGGAGG 3540

TTGGGCCCTG TGGGCCTGCC CATCCCACGT GCCTGCATTA GCCCCAGTGC TGCTCAGCCG 3600

TGCCCCCGCC GCAGGGGTCA GGTCACTTTC CCGTCCTGGG GHAHATGA CTCTTGTCAT 3660

TGCCATTGCC ATTTTTGCTA CCCTAACTGG GCAGCAGGTG CTTGCAGAGC CCTCGATACC 3720

GACCAGGTCC TCCCTCGGAG CTCGACCTGA ACCCCATGTC ACCCHGCCC CAGCCTGCAG 3780

AGGGTGGGTG ACTGCAGAGA TCCCnCACC CAAGGCCACG GTCACATGGT HGGAGGAGC 3840

TGGTGCCCAA GGCAGAGGCC ACCCTCCAGG ACACACCTGT CCCCAGTGCT GGCTCTGACC 3900

TGTCCTTGTC TAAGAGGCTG ACCCCGGAAG TGTTCCTGGC ACTGGCAGCC AGCCTGGACC 3960

CAGAGTCCAG ACACCCACCT GTGCCCCCGC HCTGGGGTC TACCAGGAAC CGTCTAGGCC 4020

CAGAGGGGAC nCCTGCTTG GCCTTGGATG GAAGAAGGCC TCCTATTGTC CTCGTAGAGG 4080

AAGCCACCCC GGGGCCTGAG GATGAGCCAA GTGGGAnCC GGGAACCGCG TGGCTGGGGG 4140

CCCAGCCCGG GCTGGCTGGC CTGCATGCCT CCTGTATAAG GCCCCAAGCC TGCTGTCTCA 4200

GCCCTCCALT CCCTGCAbAa LILAbAAuLA LuALLLLAub uAIAILAILu AlAflbUI 4260

ATCCCCTGCC GGTGCCTCTG GGGTAAGCTG CCTGCCCTGC CCCACGTCCT GGGCACACAC 4320

ATGGGGTAGG GGGTCTTGGT GGGGCCTGGG ACCCCACATC AGGCCCTGGG GTCCCCCCTG 4380

TGAGAATGGC TGGAAGCTGG GGTCCaCCT GGCGACTGCA GAGCTGGCTS GCCGCGTGCC 4440

ACTCTTGTGG GTGACCTGTG TCCTGGCCTC ACACACTGAC CTCCTCCAGC TCCTTCCAGC 4500

AGAGCTAAGG CTAAGTGAGC CAGAATGGTA CCTAAGGGGA GGCTAGCGGT CCHCTCCCG 4560

AGGAGGGGCT GTCCTGGAAC CACCAGCCAT GGAGAGGaG GCAAGGGTCT GGCAGGTGCC 4620
CCAGGAATCA CAGGGGGGCC CCATGTCCAT riCAGGGCCC GGGAGCCTTG GACTCCTCTG 4680

GGGACAGACG ACGTCACCAC CGCCCCCCCC CCATCAGGGG GACTAGAAGG GAC&^GGACT 4740

GCAGTCACCC TTCCTGGGAC CCAGGCCCCT CCAGGCCCCT CCTGGGGCTC CTGCTCTGGG 4800

CAGCTTCTCC HCACCAATA AAGGCATAAA CCTGTGCTCT cccrraGAG TCTTfCaGG 4860

ACGACGGGCA GGGGGTGGAG AAGTGGTGGG GAGGGAGTCT GGCrCAGAGG ATGACAGCGG 4920

G£3aGGGATC CAGGGCGTCT GCATCACAGT CnGTGACAA CTGGGGGCCC ACAC/VCATCA 4980

CTGCGGGCT nGAAACTTT CAGGAACCAG GGAGGGACTC GGCAGAGACA TCTGCCAGH 5040

CACTTGGAGT GHCAGTCAA CACCCAAACT CGACAAAGGA CAGAAAGTGG AAAA'IGGCTG 5100

TCTCTTAGTC TAATAAATAT TGAWGAAA CTCAAGTTCC TCATGGATCA ATATtXCTTT 5160

ATGATCCAGC CAGCCACTAC TGTCGTATCA ACTCATGTAC CCAAACGCAC TGATCTGTCT 5220

GGCTAATGAT GAGAGAHCC CAGTAGAGAG CTGGCAAGAG GTCACAGTGA GAAC-'GTCTG 5280

CACACACAGC AGAGTCCACC AGTCATCCTA AGGAGATCAG TCCTGGTGTT CATT(EAGGA 5340

CTGATGHGA AGCTGAAACT CCAATGCTTT GGCCACCTGA TGTGAAGAGC TGArCATTT 5400

GAAAAGACCC TGATGCTGGG AAAGAHGAG GGCAGGAGGA GAAGGGGACG ACAG/^TG 5460

AGATGGnGG ATGGCATCAC CAACACAATG GACATGGGH TGGGTGGACT CCAGCiAGTTG 5520

GTGATGGACA GGGAGGCCTG GCGTGCTACG GAAGCGGTfT ATGGGGTCAC AAAG/vCTGAG 5580

TGAC7GAACT GAGCTGAACT GAATGGAAAT GAGGTATACA GCAAAGTGGG GAmTTTAG 5640

ATAATAAGAA TATACACATA ACATAGTGTA TACTCATAH TTTATGCATA CCTG/iATGCT 5700

CAGTCAaCA GTCGTATCTG ACTCTGTGAC CTATGGACCG TAGCCTTCCA GGTTTCnCT 5760

GTCCACAGAA HCTCCAAGG CAAGAATACT GGAGTGGGTA GCCATTTCCT CaCCAGGGG 5820

ATCCTCCCGA CCCAGGGATT GAACCGGCAT CTCCTGTAn GGCAGGTGGA TTCTITACCA 5880

CTGTGCCACC AGv-,GAAGCCC GIGHACTCT CTATGTCCCA CTTAATTACC AAAGCTGCTC 5940
CAAGAAAAAG CCCCTGTGCC CTCTGAGCH CCCGGCCTGC AGAGGGTGGT GGGGGTAGAC 6000

TGTGACCTGG GAACACCCTC CCGCHCAGG ACTCCCGG6C CACGTGACCC ACAffTCCTGC 6060 

AGACAGCCQG GTAGCTCTGC TCTTCAAGGC TCATTATCTT TAAAAAAAAC TGAGGTCTAT 6120 

TTTGTGACn CGCTGCCGTA ACHCTGAAC ATCCAGTGCG ATQGACAQGA CCTCCTCCCC 6180 

AGGCCTCAGG GGCHCAGGG AGCCAGCCIT CACCTATGAG TCACCAGACA CTCG6GGGTG 6240 

GCCCCGCCn CAGG6TGCTC ACAGTCHCC CATCGTCCT6 ATCAAA6AGC AAGA(XAAT6 6300 

ACTTCnAGG AGCAAGCA6A CACCCACAG6 ACACT6AGGT TCACCAGAGC TGAGaGTCC 6360 

TTTTGAACCT AAAGACACAC AGCTCTC6AA GGnTTCTCT TTAATCTGSA TTrAAGGCCT 6420 

ACTTGCCCCT CAAGAGQGAA 6ACAGTCCTG CATGTCCCCA G6ACAGCCAC TCGGTBGCAT 6480 

CCGAGGCCAC TTAGTAnAT CT6ACCGCAC CCIGGAAHA ATCGGTCCAA ACTG6ACAAA 6540 

AACCTTGSTG GGAAGITTCA TCCCAGAGGC CTCAACCATC CTGCTTTGAC CACCaGCAT 6600 

ClilllllCT nTATGTGTA TGCATGTATA TATATATATA TAN I III II mTTCATTT 6660 

TTTGGCTGTG CTGGCT6TTC GHGCAGTrC GGTGCGCAGG CHCTCTCTA 6TTTCTCTCT 6720 

AGTCTTCTCT TATCACAGAG CAGTCTCTAG AC6ATCGACG CGT 6763 
(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

Arg Ile Arg Lys Arg
1 5 

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

Gln Arg Arg Lys Arg
1 5 
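
Each nucleotide row in the listing above ends with a running count of the bases printed so far, which gives a simple way to spot rows damaged in transcription. The following is a minimal sketch of such a check, assuming rows are whitespace-separated groups followed by the count; the function name `row_is_consistent` and its `previous_count` parameter are hypothetical and not part of the specification.

```python
def row_is_consistent(row: str, previous_count: int = 0) -> bool:
    """Return True if a listing row's bases agree with its running count.

    `previous_count` is the count printed at the end of the preceding row
    (0 for the first row of a sequence).
    """
    parts = row.split()
    if len(parts) < 2:
        return False
    *groups, count = parts
    bases = "".join(groups)
    # Assumption: only the four unambiguous nucleotide letters are accepted.
    if set(bases) - set("ACGT"):
        return False
    return count.isdigit() and previous_count + len(bases) == int(count)


# The single-row sequence given for SEQ ID NO: 12 passes the check.
seq12_row = "ATTTGCGGCC GCCTGCAGCC ATGTGGCAGC TCACAAGCCT CCTGC 45"
print(row_is_consistent(seq12_row))
```

A row that fails this check is only flagged as suspect; the check cannot say which base is wrong.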
CLAIMS

1. A method for producing protein C in a 
transgenic animal comprising: 

providing a DNA construct comprising a first DNA 
segment encoding a secretion signal and a protein C propeptide 
operably linked to a second DNA segment encoding protein C,
wherein the encoded protein C comprises a two-chain cleavage
site modified from Lysine (Lys)-Arginine (Arg) to R1-R2-R3-R4,
and wherein each of R1, R2, R3, R4 is individually Lys or Arg,
and wherein said first and second segments are operably linked
to additional DNA segments required for expression of the
protein C DNA in a mammary gland of a host female animal; 

introducing said DNA construct into a fertilized egg 
of a non-human mammalian species; 

inserting said egg into an oviduct or uterus of a 
female of said species to obtain offspring carrying said DNA 
construct; 

breeding said offspring to produce female progeny 
that express said first and second DNA segments and produce 
milk containing protein C encoded by said second segment, 
wherein said protein has anticoagulant activity upon 
activation; 

collecting milk from said female progeny; and 
recovering the protein C from the milk. 

2. The method of claim 1, further comprising the 
step of activating the protein C. 

3. The method of claim 1, wherein R1-R2-R3-R4 is 
Arg-Arg-Lys-Arg (SEQ ID NO: 20). 

4. The method of claim 1, wherein said species is 
selected from sheep, rabbits, cattle and goats. 
5. The method of claim 1, wherein each of said
first and second DNA segments comprises an intron.

6. The method of claim 1, wherein the second DNA 
segment comprises a DNA sequence of nucleotides as shown in 
SEQ ID NO: 1 or SEQ ID NO: 3.

7. The method of claim 6, wherein the second DNA
segment comprises the DNA sequence of nucleotides as shown in
SEQ ID NO: 1.

8. The method of claim 1, wherein the additional
DNA segments comprise a transcriptional promoter selected from
the group consisting of casein, β-lactoglobulin, α-lactalbumin
and whey acidic protein gene promoters.

9. The method of claim 8, wherein the
transcriptional promoter is the β-lactoglobulin gene promoter.

10. A transgenic non-human female mammal that
produces recoverable amounts of human protein C in its milk,
wherein at least 90% of the human protein C in the milk is
two-chain protein C.

11. A process for producing a transgenic offspring 
of a mammal comprising: 

providing a DNA construct comprising a first DNA 
segment encoding a secretion signal and a protein C propeptide 
operably linked to a second DNA segment encoding protein C, 
wherein the encoded protein C comprises a two-chain cleavage 
site modified from Lys-Arg to R1-R2-R3-R4, and wherein each of
R1, R2, R3, R4 is individually Lys or Arg, and wherein said
first and second segments are operably linked to additional
DNA segments required for expression of the protein C DNA in
the mammary gland of a host female animal;
introducing said DNA construct into a fertilized egg
of a non-human mammalian species; and

inserting said egg into an oviduct or uterus of a
female of said species to obtain offspring carrying said DNA
construct.

12. The process according to claim 11, wherein
R1-R2-R3-R4 is Arg-Arg-Lys-Arg (SEQ ID NO: 20).

13. The process according to claim 11, wherein the 
offspring is female. 

14. The process according to claim 11, wherein the 
offspring is male. 

15. A non-human mammal produced according to the
process of claim 10.

16. A non-human mammal of claim 15, wherein the
mammal is female.

17. A female mammal according to claim 16 that
produces milk containing protein C encoded by said DNA 
construct, wherein said protein C has anticoagulant activity 
upon activation. 

18. A non-human mammalian embryo containing in its 
nucleus a heterologous DNA segment encoding protein C, wherein 
the encoded protein C comprises a two-chain cleavage site 
modified from Lys-Arg to R1-R2-R3-R4, and wherein each of R1,
R2, R3, R4 is individually Lys or Arg.
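
For illustration only, the cleavage-site definition recited in the claims (four positions, each individually Lys or Arg) can be enumerated as below. This is a sketch under that stated definition; the helper names `candidate_sites` and `is_claimed_site` are hypothetical and do not appear in the specification.

```python
from itertools import product

BASIC_RESIDUES = ("Lys", "Arg")


def candidate_sites() -> list[str]:
    """Enumerate every R1-R2-R3-R4 site where each position is Lys or Arg."""
    return ["-".join(site) for site in product(BASIC_RESIDUES, repeat=4)]


def is_claimed_site(site: str) -> bool:
    """Check whether a site written with hyphens or spaces fits the definition."""
    residues = site.replace("-", " ").split()
    return len(residues) == 4 and all(r in BASIC_RESIDUES for r in residues)


sites = candidate_sites()
print(len(sites))                          # number of combinations (2**4)
print(is_claimed_site("Arg-Arg-Lys-Arg"))  # the site of SEQ ID NO: 20
print(is_claimed_site("Lys Arg Lys Arg"))  # the site of SEQ ID NO: 21
```

The unmodified two-residue Lys-Arg site does not satisfy `is_claimed_site`, since the definition requires four positions.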
Fig. 2b